Let’s JOIN

JOIN is one of the common task we perform against databases. This post explains it using TSQL.

INNER JOIN

In a simple JOIN condition we normally do not specify the word INNER, so when we write just JOIN it means an INNER JOIN. INNER JOIN returns only the matching the rows of the tables.

The queries can be executed against the Adventure Works LT database

A simple INNER JOIN between two tables

SELECT SP.ProductID, SP.Name, SPC.Name AS ‘Category’
    FROM SalesLT.Product AS SP INNER JOIN SalesLT.ProductCategory AS SPC
        ON SP.ProductCategoryID = SPC.ProductCategoryID

INNER JOINs can be specified in the WHERE clause as well, but it is recommended to keep them in the FROM. This has no performance issues at all, because even when you specify the JOIN condition in WHERE clause SQL Server executes an INNER JOIN internally.

OUTER JOINS

OUTER JOINS are used to return all rows from one table and matching rows from the other tables. It fills the NULLS where the matching rows cannot be found.

Again the keyword OUTER is optional, but we have to specify whether it is LEFT, RIGHT or FULL OUTER JOIN. When you specify LEFT OUTER JOIN the table in the left side of the JOIN clause is considered as the outer table; means that all rows from this table will be returned with the matching rows of the other table and NULLs replaces where the matching is not there.

Same applies to RIGHT OUTER JOIN, when you use the RIGHT OUTER JOIN the table in the right side of the JOIN is your outer table. So you can get the same results in both LEFT and RIGHT outer JOINs by changing the place of the tables.

SELECT SC.FirstName, SH.SubTotal FROM SalesLT.Customer AS SC
    INNER JOIN SalesLT.SalesOrderHeader AS SH
        ON SC.CustomerID = SH.CustomerID

This results only the Customers who have placed orders with their first name along with the sub total. (It returns 32 rows)

image

When specify a LEFT OUTER JOIN like this

SELECT SC.FirstName, SH.SubTotal FROM SalesLT.Customer AS SC
    LEFT OUTER JOIN SalesLT.SalesOrderHeader AS SH
        ON SC.CustomerID = SH.CustomerID

It returns all the rows of the Customer table with the sub total where filling NULLs against the customers who have not placed any orders. Since OUTER JOIN pulls all the rows from the outer table, this returns the exact number of rows that the Customer table has. (847 rows)

image

 

In this case if you specify the RIGHT OUTER JOIN it pulls all the rows from the OrderHeader table and matching rows from the Customer table. Since the OrderHeader table has FOREIGN KEY of CustomerID and it is NOT NULLABLE, it returns the same result as the INNER JOIN does.

As a generic decision we can say that if the table has a referential integrity defined with NOT NULL constraint, when we specify the OUTER JOIN based on that table (either LEFT or RIGHT) this gives the exact results as INNER JOIN does.

 

Apart from LEFT and RIGHT OUTER JOINs FULL OUTER JOIN returns all the rows from all the tables specified, by filling NULLs in both sides. This is useful in finding the unmatched rows when there is no referential integrity is defined among the tables.

CROSS JOIN

CROSS JOIN is another type which is the Cartesian product. It matches all rows from one table with all rows from the other table, thus returning m * n rows where m is the number of rows of the first table and n being the number of rows in the second table.

Since CROSS matches every row with all the rows, it does not have an ON condition. (apparently it doesn’t need to have one)

SELECT SC.FirstName, SH.SubTotal FROM SalesLT.Customer AS SC
    CROSS JOIN SalesLT.SalesOrderHeader AS SH

The above query returns 21104 rows (847 rows from Customer table and 37 rows from OrderHeader table)

SELF JOIN

SELF JOIN can be used in the scenarios where the same table has to be joined with itself. For example if you want to find the name of the employees and the name of their supervisors (where the supervisor also an employee) you make SELF JOINs.

There is no any special keyword for SELF JOIN. (as it is not a special operation, it is just a special case where we can use all the other JOIN types)

SQL Server recommends to use CTEs when there are complex and recursive SELF JOINs scenarios occur.

Advertisement

1 thought on “Let’s JOIN

  1. I forgot to mention about the aliases that we have to use for the SELF JOIN.

    When you do SELF JOIN you have to put aliases to differentiate the table instances. And in SELF JOINS aliases are not optional

Comments are closed.