Selecting the Top N in each Category

When we write queries in TSQL selecting the top ‘n’ number of rows is easy. It can be achieved by a query similar to the following.

SELECT TOP(10) Names FROM [Users]
WHERE IsActive = 'True'
ORDER BY ApprovedOn DESC

But when selecting the items which are from different tables and with subcategories this is quite tricky.

I had a scenario where Users can comment on each service/product. Each service/product is listed under a sub category. The need is the query should list down the recent 5 comments under each service sub category with the details of who made the comment.

Apparently I have to JOIN 4 tables. Selecting the TOP(n) from each category is challenging. I searched the Internet and some solutions say that execute the queries to get the top 5 in each category and combine them using the UNION. That doesn’t sound good and in my scenario the number of categories is dynamic. (So If I have to go with the above solution I have to write COUNT for sub categories and then follow that with bulky TSQL stored procedure).

This is a method I tried by combining some techniques mentioned in the Internet.

Table Structure (Tables contain more fields, but in the picture only the related fields are shown)

Capture

There two ways you can write the query. First using a temporary table which is not advisable in terms of efficiency. Second is using sub queries.

Using Temporary tables.

DECLARE @Temp TABLE
(
    Row INT,
    RealServiceID BIGINT,
    CategoryName VARCHAR(100),
    ServiceName VARCHAR(100),
    Comment VARCHAR(MAX),
    Username VARCHAR(50)
)

INSERT INTO @Temp
SELECT ROW_NUMBER() OVER
(PARTITION BY SC.CategoryID ORDER BY SC.CategoryName) AS Row, 

RS.RealServiceID, SC.CategoryName, RS.ServiceName, CM.Comment, US.Username
    FROM RealServices AS RS
              JOIN Comments AS CM ON
        RS.RealServiceID = CM.CommentOn JOIN
            ServiceCategory AS SC ON RS.CategoryID = SC.CategoryID
        JOIN Users AS US ON CM.UserID = US.UserID
    ORDER BY SC.CategoryName, CM.CommentOn

SELECT * FROM @Temp WHERE Row <= 5

Using Sub queries

SELECT RowNumber, RealServiceID, CategoryID, CategoryName, ServiceName,
    Comment, Username, SubmissionDate
FROM (
    SELECT ROW_NUMBER() OVER
(PARTITION BY SC.CategoryID ORDER BY SC.CategoryName, CM.CommentOn) AS RowNumber,
        RS.RealServiceID,
        SC.CategoryID,
        SC.CategoryName,
        RS.ServiceName,
        CM.Comment,
        US.Username,
        CM.SubmissionDate

    FROM RealServices AS RS
        JOIN Comments AS CM ON
        RS.RealServiceID = CM.CommentOn JOIN
            ServiceCategory AS SC ON RS.CategoryID = SC.CategoryID
        JOIN Users AS US ON CM.UserID = US.UserID
    ) AS InnerTable
WHERE RowNumber <= 5
 ORDER BY ServiceName, SubmissionDate DESC

Feel free to make comments on this Smile

SQL Stored Procedures

You can have a good sample on SQL Server Stored procedures using ADO.NET in C# in this link. http://www.codeproject.com/KB/cs/simplecodeasp.aspx

The article is simple and easy to understand.

Feel free to ask any questions on stored procedures as I will try my best to answer.

SQL Server Protocols

Based on some missing images this document might appear in a broken flow

SQL Server is the Microsoft’s Enterprise Data Management system. Recent release is SQL Server 2008 R2 (x64 bit only). We all know about the Database Management System and Database related concepts.
SQL Server comes with two powerful tools SQL Server Management Studio and SQL Server Configuration Manager.

To start learning SQL Server 2008 you should have an instance of database engine. Install it and connect to using SQL Server Management Studio.

To connect to the default instance type (local) or . (a dot) or server name. In your local machine the server name would be your machine’s name, in my case Thurupathan-PC or type localhost Typically you can specify the <server name>/<instance name>. If you do not provide the instance name it will connect to the default instance.

The parameters you can specify as server are

  • (local) – to connect to the local default instance
  • . – to connect to the default instance
  • localhost – to your local machine to default instance (Thurupathan-PC also same)
  • server name/instance name (ex – Thurupathan-PC/Sharepoint)

Here server is the physical (or a virtual ) machine. You can install more than one instance of database engine in one server.

You are connected to the default instance. Download the Adventure Works 2008 database from http://www.codeplex.com . This comes as a .msi file you can just double click it to install. (I didn’t explain the Restore Database process here for older Adventure works scripts, but I’ll explain restoring the database using another example). You do need to have the adventure works database in order to do the following exercise.

Then you are done with the learning track of the SQL Server 2008 or SQL Server 2008 R2.

Here I do not explain from the very beginning SELECT statements as you can Google for plenty of SQL statement tutorials. I explain only some specific things related to SQL Server configurations.

SQL Server runs on top of SQL Server OS which runs on a Windows OS. It has four protocols to connect with the database engine. You can view them through the SQL Server Configuration Manager.

MS SQL Server uses the port 1433 as its default port.

The default MSSQLSERVER instance can be configured with 4 different protocols.

  • Shared Memory – This protocols is used to run the server in the local machine.(You can right click and make it enable and disable)
  • Named Pipes – This used serve the machines in the local network. (typically in a specific broadcasting domain)
  • TCP/IP – This TCP/IP as we all know used to connect over the Internet, beyond a single broadcasting domain.
  • VIA – Virtual Interface Adapter is used to connect machines using specific machines by specifying the NIC number. This is also used to make transactions with different Database management systems.(I’m searching for some good explanation for VIA)

You can get a question that if Shared Memory, Named Pipes and TCP/IP are enabled which will be used for a connection. The Shared Memory cannot gain access to an external machine so no need to talk about that.

If the server and the client are in a single broadcasting domain what will be used ? Named Pipes or TCP/IP. It is based on which order the protocols are configured. You can change the order by switching to the SQL Native Client 10.0 Configuration and Client Protocols.Named Pipes are typically faster than the TCP/IP.

And Finally by executing this query (in SQL Server Management Studio) you can view the currently used protocol. (After executing in your local machine you’ll get the Shared Memory as result)

SELECT net_transport FROM sys.dm_exec_connections WHERE session_id = @@SPID;

You can disable various protocols and try to the above query to get different results. (If you are enabling or disabling the protocols you should have to disconnect from the instance and restart the service to take effect)