Archive for the ‘Design’ Category
Okay, so I did know what TDD was. I’ve read about it. I’ve (sort-of) tried it out. But this morning I was listening to a Hanselminute podcast about TDD and I finally got what it really is. It is NOT “Test” Driven Development. The tests are there only because we happen to use a unit testing tool to help us apply this method of software development.
It really should be called something like “Harness” Driven Development or “Wrapper” Driven Development because what we are doing is creating a software exercising harness to help us design and develop our applications.
The side-effect of having a bunch of unit-tests so we can have confidence in the correctness of our code and so we can quickly test later changes, is great, but it is a side-effect and not the purpose of TDD.
Example: I need view models for the customer related views in an ASP.NET MVC app. So, to figure out the design of how I’m going to do this, I create a class in my “test” project with a method containing the functionality to get one of these view models: I have it instantiate an object of the class that will have this responsibility and make a call to a method on the class that will return one of the view models I need. This is the first step of figuring out the class and the api it will have. Now I can continue with the other steps of TDD. What I am doing here is NOT testing. It is designing and developing in an interactive manner. It gives me a quick turnaround because I don’t have to fire up the entire application to see if what I just did works, I can run just that single “test”.
It’s not practical to rename TDD to something else because the term is so ubiquitous, but from now on I’m going to name my “test” project “DevWrappers” or “DesignWrappers” or something similar to help me keep in mind that I’m NOT testing, I’m developing.
And now I understand that the “Test” in TDD is a misnomer, I think I will be using it a lot more.
An alternative title for this article could be “When is a Data Access Layer not a Data Access Layer?”
In the last couple of years I’ve come across a “pattern” that is used for database access that I think deserves the title of anti-pattern. I’m going to call it the “DB Class” pattern because the examples of it I’ve come across have used the name “DB”, “DBCommon” and similar to name the class.
The way it works is that the class has methods that are passed the name of a stored procedure or a sql statement and an array of parameters. The methods generally return data tables or scalar values or data sets.
Here is an example of a call to such a method:
DataTable customers = DB.GetTable("dbo.GetCustomers", params);
Where “dbo.GetCustomers” is the name of a stored procedure and “params” is a SQLParameter array.
What’s wrong with this pattern is that the calling code has to know the structure of the database – name of stored procedure, names and types of the parameters. A true Data Access Layer hides the database from the other parts of the application.
A better example of a call to such a method would be:
DataTable customers = DB.GetCustomers(nameStartsWith, custType);
Where the method parameters are standard .NET data types, not SQLParameters.
An even better example would be:
List<Customer> customers = DB.GetCustomers(nameStartsWith, custType);
Where a collection of objects is returned. This would be using a Transaction Script pattern.
I’ve seen this “DB Class pattern” used in two ways:
- The methods are called directly from all over the place. For example, in the code-behind of an aspx page or in an event handler of a Windows Form.
- The methods are called by a “data access layer”. For example, in the code-behind a call is made to a method in a data access class, then the data access class makes the call to DB.
The 1st way exposes the database structure to the harsh light of day – not a good idea. Try to make some database changes or change the parameters of a stored procedure and you will be doing text searches and endless regression testing to make sure you didn’t break something.
The 2nd way is not as bad, at least the stored procedure and the parameters are hidden from the UI code but these methods return generic DataTable objects and then the code that made the call has to know the names of the table columns, which brings us back to the same issue as before – code outside the data access layer has to know the database structure.
If you really have to use DataSets and DataTables for passing your data around then use Typed DataSets. Visual Studio has designers and wizards that make it easy and they also create a “TableAdapter” class that encapsulates your database interactions.
Better still is to use a Domain Model pattern and an ORM tool that converts data from a database into objects and collections of objects. NHibernate, LLBLGen and now Entity Framework are just three of the many choices available.
So, to conclude, I think this “DB Class” is an anti-pattern, and if no one else has named it as such then I claim that distinction and all the honors antendent to it 🙂
More on Anti-Patterns:
I’ve never had to deal with a SQL Injection attack before, mainly because in all the sites I’ve ever created I pass any input from the site to the database in parametrized form. I have never built SQL statements on the fly by adding user supplied values in strings.
But this week a new client received such an attack which brought down the corporate web site. So I had to get involved and figure out what to do.
When I looked into the database I found all character fields were filled with “”. (BTW, if you Google that string you will find it all over the place, so the incompetent programmer who created my client’s site is not alone.)
The attacker managed to inject the script because:
1. The guy who created the site put inline sql all over the place. Here is a sample:
sql = "select * from content where id = '" & Request.QueryString("id") & "'"
2. He never checked any user input or any query string for dangerous content.
3. When he set up the database he gave admin privileges to the database userid the web site used.
That was his side of the story. The attackers side went like this. They used the query string to inject a huge sql statement that they hid in a hex encoded number.
page.aspx?id=29;DECLARE%20@S%20NVARCHAR(4000);SET%20@S=CAST(0x44 ... 200%20AS%20NVARCHAR(4000));EXEC(@S);--
I cut out the full number because it was almost 2k digits long. This huge string of hex digits was cast into a very clever set of sql statements that opened a cursor on the sysobjects and syscolumns tables and updated every character type column by adding that script tag to it.
One clever thing about this is that all of the articles I read when I was researching what to do mentioned looking out for strings such as “select”, “drop”, “insert” and “update” but not one mentioned “declare”, “set” or “cast”. Of course catching the “;” would have prevented the attack and most articles did mention that.
So, how were the attacks handled?
I created a function that handled the single quote issue and nuked the naughty words(like “declare”, “select”, etc.). Then me and two other developers went through the 500+ locations in the code where the !@#%$^ who created the site had built his sql statements by concatenating strings with variables and we wrapped every variable and every Request.QueryString(“paramname”) in a call to the function. Tedious in the extreme, but necessary.
And it worked. I know that because they have tried the attack several times since (robots never give up) and it has not succeeded. So when you do a Google search for that script tag you won’t find my client’s web site in the results.
So, how do you avoid SQL Injection Attacks?
- Never build sql statements by concatenating strings and variables.
- Use parameters (such as SQLParameter or OLEDBParameter or whatever the equivalent is in your language of choice) in your database calls.
- Validate user input and URL query strings so you detect and reject potentially dangerous entries.
- Limit how much a user can enter in text boxes so they can’t add on a bunch of sql statements.
- If you are stuck with an existing site that is wide open to attack, then wrap every variable and every use of the URL query string in a call to a sanitizing method.
- Get educated on SQL Injection Attacks. Read the resources below and anything else you need to until you get the idea.
Some resources on SQL Injection Attacks:
I recently re-read SQL Server 2005 Bible, a great book that covers everything you ever wanted to know about SQL Server but were afraid to ask. At one point in the book the author says that your data abstraction layer should be in your database in the form of stored procedures and he gives some good arguments as to why.
Right after that I read Pro LINQ Object Relational Mapping in C# 2008, another excellent book that shows you how to create an ORM using LINQ (both LINQ to SQL and Entity Framework) . When introducing the fact that you can use stored procedures in LINQ, the author very definitely says that even though they are supported, you shouldn’t use them and he gives some good arguments as to why.
Two diametrically opposing views in two really good books. The error both authors make is thinking that one size fits all. The truth is that “all” can vary widely so one size just ain’t gonna cut it.
My view is that everything we have in our industry is a tool and each tool has a range of uses.
Example: I recently joined a project that heavily uses stored procedures in line with the “SQL Server Bible” author. I think it is appropriate given the huge amount of data the application has to deal with. Tuning many queries within an inch of their life, including being able to completely rewrite them or split a single query into several, is very necessary on this project. Using LINQ or some other ORM tool to only generate the SQL and never use stored procedures would just not work.
Another Example: I was working on a project a couple years ago where LINQ to SQL would have been perfect for fast creation of a Data Access Layer in an existing application that needed some major enhancements, but unfortunately we had to still support users with Windows 2000 machines and .NET Framework 3.5 doesn’t support W2K, so we were stuck with using typed datasets. But we didn’t use stored procedures for our abstraction layer. It really wasn’t necessary. Yes, some tuning was needed, but the app usage was mild and the amount of data relatively small, even after many years of use, so an ORM would have been great and would have made our lives simpler.
Anyway, the moral of the story is use the tool that is appropriate and remember that absolutes are unobtainable.