Intro To Logical Joins In SQL Server

Hi everyone, here we explored some logical ways of joining table as an intro to the physical operators.

Inner joins

It is the most common join type,

For a join predicate we specify, show the matching rows only,

For example:

use AdventureWorks2022
go
select*
from Sales.SalesOrderHeader soh inner join Sales.SalesOrderDetail SOD
                                  ON soh.SalesOrderID = SOD.SalesOrderID
go

So here, show all the rows where SalesOrderId is equal

A visual presentation would be:

The implicit join is an inner join, maybe because it is the most common:

use AdventureWorks2022
go
select*
from Sales.SalesOrderHeader soh join Sales.SalesOrderDetail SOD
                                  ON soh.SalesOrderID = SOD.SalesOrderID
go

It is a fully commutative join type, meaning SalesOrderHeader inner join SalesOrderDetail = SalesOrderDetail inner join SalesOrderHeader

Outer Joins:

Suppose we want all the details from SalesOrderHeader, even if it did not satisfy the matching predicate, meaning we did not have the details we wanted in the SalesOrderDetail

We could use:

use AdventureWorks2022
go
select*
from Sales.SalesOrderHeader soh left outer join Sales.SalesOrderDetail SOD
                                  ON soh.SalesOrderID = SOD.SalesOrderID
go

So it would get us all the rows from the left side, and any matches from the right, it would replace the non-matching rows with null

A Visual representation would be like:

Now, since there are no non-matching rows here, we won’t see the nulls we were talking about

But if we do the following:

Use adventureWorks2022
go
CREATE TABLE UserDirectory (
    UserID int ,
    UserAlias varchar(50)
);

CREATE TABLE PurchaseHistory (
    UserID int,
    Product varchar(50)
);

INSERT INTO UserDirectory (UserID, UserAlias) VALUES
    (500, 'PixelMaster'),
    (501, 'CodeWizard'),
    (502, 'DataNinja');

INSERT INTO PurchaseHistory (UserID, Product) VALUES
    (505, 'GamingConsole'),
    (501, 'MechanicalKeyboard'),
    (501, 'UltraWideMonitor'),
    (506, 'NoiseCancellingHeadphones'),
    (503, 'Smartwatch');
go

Then do this:

select*
from UserDirectory left outer join PurchaseHistory 
                      on UserDirectory.UserID = PurchaseHistory.UserID
go

We get:

The nulls are the non-matching rows in the second table.

With a full outer join, we find all the rows, and combine them, but we match the matching, so either table could produce nulls like:

select*
from UserDirectory full outer join PurchaseHistory 
                      on UserDirectory.UserID = PurchaseHistory.UserID
go

Now full outer joins are commutative, so a full outer join b = b full outer join a

And, a a left outer join b = a b right outer join a, since here we are trying to preserve a

CROSS JOIN

It is a Cartesian product of both tables, meaning it matches every row of one table with every row of the other table

We can’t specify a join predicate for it, but we can specify a predicate in the where clause

In that case, it would be an inner join

So:

select*
from userdirectory cross join purchasehistory

It can only use a nested loop, and we would get a warning in the execution plan since there is no join predicate, but if what we are trying to achieve a caretsian product of the two tables, then this is what we should expect, so we can ignore it there

It would be an inner join if we did the following:

go
select*
from userdirectory cross join purchasehistory
where UserDirectory.UserID = PurchaseHistory.UserID
go

CROSS APPLY

It was introduced initially to pass a value into a table valued function like a DMF or a user defined one

So here we are trying to pass a parameter into a function

If we create the following:

create function dbo.userid(@userid int)
returns @PurchaseHistory table (product varchar(25))
as 
begin 
 insert into  @PurchaseHistory 
 select product 
 from PurchaseHistory 
 where UserID = @userid
 return
end
go

So here, we are getting the product from the purchase history if we pass the userid as a parameter like:

select*
from UserDirectory 
cross apply dbo.userid(userid);

We get:

Also we can get all the UserIDs whether they have a product or not but using outer apply like:

go
select*
from UserDirectory 
outer apply dbo.userid(userid);
go

Now, there are alot more things to do with it, but as an intro, this is enough

Semi join and anti-semi join

A semi join would return only the row from the first table it matches the prediacte in the second

An anti-semi join would only return the table from the first if it did not match, so it is whatever that is not the semi join.

Now, we can’t ask for it explicitly, but, if we do the following:

EXISTS

Like using the exists syntax:

select*
from UserDirectory u 
where exists(
                    select*
                    from PurchaseHistory p
                    where p.userid = u.userid
                    
         );

We get:

So it only got the rows from the first where there are matches in the second.

But, what UserID where a duplicate like if we insert this:

insert into userdirectory(userid,useralias)
values(501,'whatever'),(501,'another')

after that :

select*
from UserDirectory u 
where exists(
                    select*
                    from PurchaseHistory p
                    where p.userid = u.userid
                    
         );

It would only return it one time

We can the semi-join in the execution plan like:

NOT EXISTS

Would be the anti-semi join:

select*
from UserDirectory u 
where not exists(
                    select*
                    from PurchaseHistory p
                    where p.userid = u.userid
                    
         );

And with this we finish this intro.

Leave a Reply

Your email address will not be published. Required fields are marked *