My Software Notes

Useful things I discover

Sharding Notes: First rule of database sharding

leave a comment »

As far as I’m concerned, the first rule of database sharding is “You must have a data access layer“.

If you have architected your system in a “client-server” style were data access code is embedded in the client code then the effort required to shard is going to be huge, partially because you have so many places in your code to address and partially because of the vast amount of testing that is going to be required because you have likely touched every source file in your system.

I hit this problem some time ago.  It wasn’t on a sharding project but something similar.  The project was writing the next version of a system that had been written in VB6 by some inexperienced developers.  Data access code and dynamically built SQL statements were all over the place.  Their idea of a data access layer was a “Db” module that had methods for running queries and returning DAO result sets.  E.g.,

Set rs = Db.Execute("Select * from Customers where " _
    + strCol + " = " + strCustName)

And about 20% of the time they would directly call DAO and not user their own “DAL”. Yuck!

We began moving the functionality over to .NET piece by piece.  I created a three-layer style architecture for the .NET pieces: UI Layer, Domain Layer (sometimes called Business Logic Layer) and Data Access Layer.  We used COM Interop to call the .NET code from the VB6 exe.  (I wish I could have created a .NET exe and converted the VB6 part to a DLL but … well that’s a subject for another post.)

After some successful releases the client decided they wanted to move over to using SQL Server instead of MS Access.  We got a SQL Server version of the database created relatively quickly and then we moved on to converting the code to use the new database.

To convert the .NET code took one developer 3 days and that included extensive unit testing.  This was because all data access code was in one place and the Domain and UI Layers knew nothing about the database: not its structure, not its constraints, not its table names, not its column names, nothing.  It would have taken less time if it were not for the fact that we had to tweek some of the Access SQL so it would play nice with SQL Server.

To convert the VB6 code took three developers over two months to get about 75% done (without all the needed testing) and was then put on hold because there was a deadline to get out some high priority new features to the system users. 

Why so long?  It wasn’t the developers, these were smart, hardworking  guys.  No, it was simply the lack of a data access layer.

The same applies to sharding a database.  If you don’t have your data access code isolated in one area then I’d suggest you do that first before you even think of sharding. 

It is probably possible to shard without a data access layer, but then it’s probably possible to stab yourself with a machete and not bleed to death, but do you really want to risk it?

Advertisements

Written by gsdwriter

June 4, 2009 at 11:22 am

Posted in Database Sharding

Tagged with ,

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: