My Software Notes

Useful things I discover

Sharding Notes – Splitting Tables

with one comment

We hit some tables where some records applied to only one shard but others applied to multiple shards.

As an example, one of the tables was Users.  The database is sharded on the Companies table.  Many users belong to only one company (e.g. employees of a company) but there were some (such as in-house support people and super-admins) who belong to multiple companies.

The possible solutions we came up with were:

  1. Split the tables according to the different records (Table per Type or Table per Concrete Class).  For example, keep the Users table for people who belong to only one company and create an AdminUsers table for those who belong to multiple companies.
  2. Add a discriminator column (e.g., UserType with values of ‘Regular’ or ‘Admin’) and handle the records according to the type (Table per Hierarchy).
  3. Move the split table that applies to multiple shards into a “master” database so it is not on the shards and query it there.  E.g., we split Users into Users and AdminUsers and move AdminUsers onto the master database. 
  4. When loading a user do a cross-shard query to get all the companies/shards the entity is on.  Then do all further CRUD operations on the entity by hitting the specific shards you found in the first query.  E.g., User #5 administers company #7 on shard 1, company #10 on shard 2 and company #15 on shard 3.  He does not administer any companies on shards 4 or 5.  Further CRUD for that user will hit shards 1, 2 and 3.

My preference is #1, but I think we’ll be going with #4 on all the “special” tables because the amount of extra work in implementing #1 right now is way more than we have the time or  resources for.  Perhaps once we have the sharded system up and running we may switch to #1, but that is in the future and a lot depends on the performance for #4.  Luckily most of the “special” tables (other than User) are not high traffic, so we’re hoping it won’t be a problem.

Of course your own situation will probably be different so you should evaluate the pros and cons of these methods for your particular situation.   And you may come up with other ways to do it.  If you do, please leave a comment describing your solution.

Good Sharding Article

Here is a good intro article on sharding: Database Sharding.


Written by gsdwriter

June 1, 2009 at 2:36 pm

Posted in Database Sharding

Tagged with ,

One Response

Subscribe to comments with RSS.

  1. […] Next:  Sharding Notes – Splitting Tables Possibly related posts: (automatically generated)Database Sharding Basics and User Rank CalculationThe Queue: Soul Shard parade […]

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: