Monday, December 27, 2010

How do I return row numbers with my query



Often, people want to "invent" an identity, or rank, on the fly. So their original result set would look like this: 
 
Lastname Firstname 
-------- --------- 
Evans    Bob 
Smith    Frank
 
And they would want this: 
 
Rownum Lastname Firstname 
------ -------- --------- 
1      Evans    Bob 
2      Smith    Frank
 
This would act like Oracle's ROWNUM, which isn't supported in SQL Server. 
 
Of course, once you've retrieved this result set into your ASP page, you could use a counter to increment as you're processing. This is by far the easiest way, e.g. 
 
<% 
    ' ... 
    set rs = conn.execute(sql) 
    counter = 0 
    do while not rs.eof 
        counter = counter + 1 
        response.write counter & " " 
        response.write rs(0) & "<br>" 
        rs.movenext 
    loop 
    ' ... 
%>
 
However, some people really, really, really want the row number to come back from the database. It's a little less efficient, but let's examine a few methods. Given this sample data: 
 
SET NOCOUNT ON 
 
CREATE TABLE people 
( 
    firstName VARCHAR(32), 
    lastName VARCHAR(32) 
) 
GO 
 
INSERT people VALUES('Aaron', 'Bertrand') 
INSERT people VALUES('Andy', 'Roddick') 
INSERT people VALUES('Steve', 'Yzerman') 
INSERT people VALUES('Steve', 'Vai') 
INSERT people VALUES('Joe', 'Schmoe')
 
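The first method we'll try is a self-join with a COUNT and a GROUP BY. Each row is joined to every row that sorts at or before it (by last name, then first name), and the number of matches becomes its rank: 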
 
SELECT 
    rank = COUNT(*), 
    a.firstName, 
    a.lastName 
FROM 
    people a  
    INNER JOIN people b 
    ON  
        a.lastName > b.lastName 
        OR 
        ( 
            a.lastName = b.lastName 
            AND 
            a.firstName >= b.firstName 
        ) 
GROUP BY 
    a.firstName, 
    a.lastName 
ORDER BY 
    rank
 
We can also compute the COUNT in a correlated subquery. This doesn't require a GROUP BY, which means you can include other columns in the outer query without having to group on them. 
 
SELECT 
    rank = ( 
        SELECT COUNT(*)  
        FROM people b 
        WHERE  
            a.lastName > b.lastName 
            OR 
            ( 
                a.lastName = b.lastName 
                AND a.firstName >= b.firstName 
            ) 
    ), 
    a.firstName, 
    a.lastName 
FROM 
    people a 
ORDER BY 
    rank
 
Results in both cases: 
 
rank firstName lastName 
---- --------- -------- 
1    Aaron     Bertrand 
2    Andy      Roddick 
3    Joe       Schmoe 
4    Steve     Vai 
5    Steve     Yzerman
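
If you'd like to try the subquery approach without a SQL Server instance handy, here is a minimal sketch that runs an equivalent query against an in-memory SQLite database from Python. SQLite and the sqlite3 module are stand-ins chosen for this illustration, not part of the original setup; the T-SQL "rank = (...)" alias is written as "(...) AS rnk", and the helper name ranked_people is made up. Note also that SQLite compares strings case-sensitively by default, which may differ from your SQL Server collation.

```python
import sqlite3

# Same sample rows as the people table above.
PEOPLE = [
    ("Aaron", "Bertrand"),
    ("Andy", "Roddick"),
    ("Steve", "Yzerman"),
    ("Steve", "Vai"),
    ("Joe", "Schmoe"),
]

# Correlated subquery: count the rows that sort at or before the current
# row when ordering by lastName, then firstName.
RANK_QUERY = """
SELECT
    (
        SELECT COUNT(*)
        FROM people b
        WHERE a.lastName > b.lastName
           OR (a.lastName = b.lastName AND a.firstName >= b.firstName)
    ) AS rnk,
    a.firstName,
    a.lastName
FROM people a
ORDER BY rnk
"""


def ranked_people(rows):
    """Load rows into a throwaway table and return (rnk, firstName, lastName) tuples."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE people (firstName TEXT, lastName TEXT)")
        conn.executemany("INSERT INTO people VALUES (?, ?)", rows)
        return conn.execute(RANK_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for rnk, first_name, last_name in ranked_people(PEOPLE):
        print(rnk, first_name, last_name)
```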
 
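Note that if you have duplicate rows in your table, the subquery version gives every copy the same rank and skips the numbers in between, because each copy counts all of the copies. You will end up with something like this: 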
 
1    Aaron     Bertrand 
3    Joe       Schmoe 
3    Joe       Schmoe
 
So, to avoid this, make sure that either (a) you prevent and remove duplicates (see Article #2431); or (b) if duplicates are allowed and make sense for your data model, you have some other primary key or unique identifier. You can then use that key as a tie-breaker in the query; for example: 
 
SET NOCOUNT ON 
 
CREATE TABLE people 
( 
    peopleID INT IDENTITY(1,1) PRIMARY KEY, 
    firstName VARCHAR(32), 
    lastName VARCHAR(32) 
) 
GO 
 
INSERT people VALUES('Aaron', 'Bertrand') 
INSERT people VALUES('Andy', 'Roddick') 
INSERT people VALUES('Steve', 'Yzerman') 
INSERT people VALUES('Steve', 'Yzerman') 
INSERT people VALUES('Steve', 'Vai') 
INSERT people VALUES('Joe', 'Schmoe') 
 
SELECT 
    rank = ( 
        SELECT COUNT(*) 
        FROM people b 
        WHERE a.lastName > b.lastName 
        OR 
        ( 
            a.lastName = b.lastName 
            AND a.firstName >= b.firstName 
        ) 
    ) - ( 
        SELECT COUNT(*) FROM 
        people b 
        WHERE a.lastName = b.lastName 
        AND a.firstName = b.firstName 
        AND a.peopleID < b.peopleID 
    ), 
    a.firstName, 
    a.lastName 
FROM 
    people a 
ORDER BY 
    rank
 
Results: 
 
rank firstName lastName 
---- --------- -------- 
1    Aaron     Bertrand 
2    Andy      Roddick 
3    Joe       Schmoe 
4    Steve     Vai 
5    Steve     Yzerman 
6    Steve     Yzerman
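
To see the tie-breaker at work, here is a small sketch that runs an equivalent query on an in-memory SQLite database from Python. As before, SQLite is only a stand-in for SQL Server: INTEGER PRIMARY KEY AUTOINCREMENT plays the role of the IDENTITY column, the alias is written with AS, and the names PEOPLE_WITH_DUPLICATE and ranked_with_tie_break are invented for this example.

```python
import sqlite3

PEOPLE_WITH_DUPLICATE = [
    ("Aaron", "Bertrand"),
    ("Andy", "Roddick"),
    ("Steve", "Yzerman"),
    ("Steve", "Yzerman"),  # deliberate duplicate
    ("Steve", "Vai"),
    ("Joe", "Schmoe"),
]

# First subquery: rows sorting at or before the current row (ties included).
# Second subquery: identical names with a higher peopleID, subtracted so
# that each copy of a duplicate ends up with its own rank.
TIE_BREAK_QUERY = """
SELECT
    (
        SELECT COUNT(*)
        FROM people b
        WHERE a.lastName > b.lastName
           OR (a.lastName = b.lastName AND a.firstName >= b.firstName)
    ) - (
        SELECT COUNT(*)
        FROM people b
        WHERE a.lastName = b.lastName
          AND a.firstName = b.firstName
          AND a.peopleID < b.peopleID
    ) AS rnk,
    a.firstName,
    a.lastName
FROM people a
ORDER BY rnk
"""


def ranked_with_tie_break(rows):
    """Return (rnk, firstName, lastName) tuples with a unique rank per row."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE people ("
            "peopleID INTEGER PRIMARY KEY AUTOINCREMENT, "
            "firstName TEXT, lastName TEXT)"
        )
        conn.executemany(
            "INSERT INTO people (firstName, lastName) VALUES (?, ?)", rows
        )
        return conn.execute(TIE_BREAK_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in ranked_with_tie_break(PEOPLE_WITH_DUPLICATE):
        print(*row)
```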
 
Grouping within groups 
 
Often, you'll want a more complex row number scheme. For example, you might want to rank within groups of a hierarchy. Let's say we wanted to list sports teams, and assign "ranks" alphabetically, within each city: 
 
CREATE TABLE #teams 
( 
    city VARCHAR(20), 
    team VARCHAR(20) 
) 
 
SET NOCOUNT ON 
 
INSERT #teams SELECT 'Boston', 'Celtics' 
INSERT #teams SELECT 'Boston', 'Bruins' 
INSERT #teams SELECT 'Boston', 'Red Sox' 
INSERT #teams SELECT 'New York', 'Yankees' 
INSERT #teams SELECT 'New York', 'Mets' 
INSERT #teams SELECT 'New York', 'Knicks' 
INSERT #teams SELECT 'New York', 'Rangers' 
INSERT #teams SELECT 'New York', 'Islanders' 
INSERT #teams SELECT 'New York', 'Jets' 
INSERT #teams SELECT 'New York', 'Giants' 
INSERT #teams SELECT 'Chicago', 'Black Hawks' 
INSERT #teams SELECT 'Chicago', 'Cubs' 
INSERT #teams SELECT 'Chicago', 'White Sox' 
INSERT #teams SELECT 'Chicago', 'Bears' 
INSERT #teams SELECT 'New England', 'Patriots' 
 
SELECT city, team, rank =  
( 
    SELECT COUNT(*) 
    FROM #teams t2 
    WHERE t2.city = t1.city 
    AND t2.team <= t1.team 
) 
FROM #teams t1 
ORDER BY city, team 
 
DROP TABLE #teams
 
Results: 
 
city        team        rank 
----------- ----------- ---- 
Boston      Bruins      1 
Boston      Celtics     2 
Boston      Red Sox     3 
Chicago     Bears       1 
Chicago     Black Hawks 2 
Chicago     Cubs        3 
Chicago     White Sox   4 
New England Patriots    1 
New York    Giants      1 
New York    Islanders   2 
New York    Jets        3 
New York    Knicks      4 
New York    Mets        5 
New York    Rangers     6 
New York    Yankees     7
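
As an aside that goes beyond the methods above: SQL Server 2005 and later include the ROW_NUMBER() window function, which can number rows within each city directly. The sketch below runs that style of query on an in-memory SQLite database from Python, purely as a stand-in for SQL Server. It assumes SQLite 3.25 or later (older versions have no window functions), uses a shortened team list, and the name rank_teams is made up for this example.

```python
import sqlite3

# A shortened version of the #teams sample data.
TEAMS = [
    ("Boston", "Celtics"),
    ("Boston", "Bruins"),
    ("Boston", "Red Sox"),
    ("Chicago", "Cubs"),
    ("Chicago", "Bears"),
    ("New England", "Patriots"),
]


def rank_teams(rows):
    """Return (city, team, rnk) tuples, numbering teams alphabetically per city."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE teams (city TEXT, team TEXT)")
        conn.executemany("INSERT INTO teams VALUES (?, ?)", rows)
        return conn.execute(
            """
            SELECT city, team,
                   ROW_NUMBER() OVER (PARTITION BY city ORDER BY team) AS rnk
            FROM teams
            ORDER BY city, team
            """
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for city, team, rnk in rank_teams(TEAMS):
        print(city, team, rnk)
```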
 
Keep in mind that, since your presentation tool (Crystal Reports, ASP, PHP, what have you) is going to have to treat every row separately anyway, it makes sense to just retrieve the rows in the correct order, and let the application compare every row to see if this is a new city or not, and accordingly increment the count or start over. This will greatly reduce the amount of strain you're putting on the database.
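
Here is a rough sketch of that application-side approach, written in Python rather than ASP so it can be run anywhere. It assumes the rows have already been retrieved in city, team order (for example with ORDER BY city, team); the function name number_within_groups is made up for this example.

```python
_NO_CITY = object()  # sentinel that never equals a real city value


def number_within_groups(rows):
    """Yield (city, team, rank) for rows already sorted by city, then team.

    The counter restarts at 1 whenever the city differs from the one on
    the previous row.
    """
    previous_city = _NO_CITY
    rank = 0
    for city, team in rows:
        if city != previous_city:
            rank = 0  # new city: start over
            previous_city = city
        rank += 1
        yield city, team, rank


if __name__ == "__main__":
    sorted_rows = [
        ("Boston", "Bruins"),
        ("Boston", "Celtics"),
        ("Boston", "Red Sox"),
        ("Chicago", "Bears"),
        ("Chicago", "Cubs"),
        ("New England", "Patriots"),
    ]
    for city, team, rank in number_within_groups(sorted_rows):
        print(city, team, rank)
```

The database only has to sort the rows; a single pass in the application does the numbering, so no self-join or per-row subquery is needed.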

