Friday, March 06, 2009

The RANK() Function for Numbering on Groups/Partitions in SQL Server

Previously, I used the ROW_NUMBER() function to get the monotonically increasing row number of a result set from a SQL query. But, this time the requirement was a bit different as follows.

I have many combinations of Year, Make, Model and Trim of vehicles in my database. I also have potential profit on sale of each of the combinations. I need to produce, the top 5 trims for a Year, Make, Model combination that yields maximum profits.

So, to solve this problem, I cannot simply use ROW_NUMBER(). Basically, I need to find the row numbers starting from 1 for each group of Year, Make and Model. Then, take only those rows having a rank <= 5. Ranking based on such groups/partitions can be easily done by using the RANK() function. I am quoting the syntax from MSDN here-

RANK ( )    OVER ( [ < partition_by_clause > ] < order_by_clause > )

So, in my case the query is something like the following-


RANK ( ) OVER ( PARTITION BY Year, MAKE, MODEL ORDER BY Profit DESC, Trim )


So, the RANK() is similar to ROW_NUMBER() in the context of the PARTITION. As if, the RANK() restarts counting from 1 for each partition. I think, like me, many of you may find it new and useful in similar requirements.


Without using this function, it will be much difficult to produce the desired result set. However, if you have a nicer/alternative solution, please teach me!