Showing posts with label Database. Show all posts
Showing posts with label Database. Show all posts

Tuesday, January 17, 2012

Object Versioning is an Open Design Problem


This unsolvable maze is a local food from Bangladesh, known as Jilapi
Photo credits to udvranto pothik
Object Versioning is often required by a business rule, for example, to maintain an audit trail or to be able to revert to a previous version, etc. This is the 3rd time in my career where this Object Versioning requirement made me think like -
There's gotta be an easier solution! 
But, I am yet to find one. So, I am thinking it's one of those open design problems, may be.

To clarify the requirement with an example, let's consider the following scenario:

A lawyer is preparing a document for one of her clients using a software. On January 17th, she needs to take a look at the version of the same document from May last year so that she can backtrace some changes that took place during these months.

Lets assume the lawyer is using a software that stores the documents in a relational database with the following schema.
A Document has many Evidences, each provided by an EvidenceProvider

Document (id, client_id, case_id, name)
Evidence (id, document_id, evidence_provider_id, details)
EvidenceProvider(id, name)
Now, given the versioning requirement how would you design your data model?

Here's a couple of points that your design should address at a minimum:
  • Going back to a version means a complete snapshot of the old version of the document. So, the version of May 1st should only bring the evidences that were there on that very day.
  • As a new version is created, it should inherit all previous evidences.
As I have mentioned earlier, I am yet to find a good data model that can take care of these concerns without over-complicating everything. Let me know if you got a beautiful solution to this problem.

However, in my latest project, the requirement is even harder. It's somewhat like this:

The lawyer may have some documents in the "work in progress version". This means, if she needs to print a document for the court, she only wants to print the last "good version", skipping the "work in progress version".

Also, when there is such a "work in progress version", she needs to attach any new Evidence to both the last "good version" as well as to the "work in progress version".

Well, now you see the design of a data model for Object Versioning becomes really messy and unintuitive.
So, here's my question to you - how would you design a solution for this?

Wednesday, September 28, 2011

The Perils of Soft Delete

Often times, applications cannot get rid of the database records for business rules. For example, if you are a cable TV provider, you might have a customer calling you to stop the National Geographic Channel subscription from next month. Ideally you would like to delete this record, but you can't do it until the effective date:( If you delete it, the next invoice will not be able to charge for this channel, although it's still being used.

In such scenarios, its common to fall back to a soft delete model. As an example, here's a little code:


This code looks simple and harmless at a first sight. But, as you develop your app, you'll run into a lot of issues from this. Here's a short list:
  1. You need to take care of cascading soft delete, for example, cancelling a customer's subscription needs to cascade down to all channels and other objects under it.
  2. Whenever, you are listing the current subscriptions of a customer, you need to filter out the ended ones.
  3. You need to figure out a strategy to periodically clean the ended subscriptions so that your database is not filled with outdated data.
  4. If you have a business key used as a primary key as well, you will be in trouble if an ended subscription is restarted.
  5. Your code will eventually have a lot of if-else blocks to apply changes to only on active objects.
So, what is the solution? Its best to avoid them as much as possible. Richard Dingwall has a detailed blog post on some alternative techniques to avoid soft-delete. But if you have to have soft delete, as shown in the example, its worth remembering the aforementioned points.

Sunday, February 22, 2009

Working on a Data Warehouse project

I have been busy for the last month as we, at Code71, rolled out the first release of the Business Intelligence project for Vuneu Media. I will give you a brief on my project here so that you can follow my coming posts.

The used vehicle dealers buy their vehicles from different Auctions across the country. When making a buy decision, they try to find answer to these questions-

  1. What vehicles are customers buying?
  2. How much it costs to get those from Auctions?
  3. How many of these are receiving cash advances and by which lenders?
  4. What is my potential gross profit if I deal a vehicle?
  5. Where can I find the vehicles in Auctions?

So, we are getting the information from data vendors. Since, the amount of data is really large (nearly 10 million rows per month in total) for human manipulation, we are adding real value to the business by providing software intelligence. We developed the technology to put the pieces together and make it handy for the dealers.

We are using the following software/technologies:-

  1. SQL Server 2008, 64 Bit, Standard Edition.
  2. SQL Server Integration Service.
  3. SQL Server Reporting Service.
  4. ASP.Net, C#, WCF.
  5. SQL Server Analysis Service (is in the pipeline, not used yet!)
  6. Windows Server 2008, 64 Bit.

I came across a number of innovations as well as issues at each step of the project. The database schema design was itself a challenge since data come from different vendors in various formats and granularities.

Here, I will start with an issue resulting from the “64 bit” SQL Server limitations-

SQL Server 64 bit versions include Integration Services, but it lacks support for some components. One such is, Microsoft Jet Driver. We use this driver to load data from Excel files.

To load Excel data using SSIS, you have the following options:-

1. Locate Program Files (x86)\Microsoft SQL Server\90\DTS\Binn\DTExec.exe. Then use this with your options. You can take a look at the command arguments by invoking DTExec –? This way you can run the SSIS packages using command line.

2. Using the SQL Server Agent Job is rather simple. Just Create/edit your job step having the Excel, then select Use 32 Bit in the Options/Execution Options tab. It will automatically use the 32 bit version of DTExec.

3. If you are using the IDE, you can turn of 64 bit by un-checking the Use 64 Bit Runtime at the Project Properties > Debugging.

More information on 64 bit SQL Server related issues are here at MSDN.

Also, I got help from Blog N-Technologies.