SARGable Predicates Storage Engine Part 7

Today we are going to show some examples of seeks in:

  • Mathematical calculations
  • Dates
  • Prefix search(Like ‘value%’)
  • substring search(like’%value%’)
  • dynamic seeks their physical and logical operators
  • multiple dynamic seeks

Mathematical calculations

now sometimes when we try to process a row as a parameter or a variable before outputting the result SQL server has to process each row and get the result for each, see if it matches our conditions, and then it would output it

for example:

use testdatabase 
declare @calc int 
set @calc = 5 
select id 
from dbo.heapVSclustered 
where id +1 = @calc

now here it has to process the row before it outputs so it scans

now if we re-write the query like this:

use testdatabase 
declare @calc int 
set @calc = 5 
select id 
from dbo.heapVSclustered 
where id= @calc-1

now here we would get a seek since it had to go to the row first

Dates

create table date_mani(
        id int identity(1,1) primary key clustered not null,
        dates datetime
);
with 
n1(c) as (select 0 union all select 0),
n2(c) as (select 0 from n1 as t1 cross join   n1 as t2),
n3(c) as (select 0 from n2 as t1 cross join  n2 as t2 ),
n4(c) as (select 0 from n3 as t1 cross join  n3 as t2),
n5(c) as (select 0 from n4 as t1 cross join  n4 as t2),
Ids(id) as (select ROW_NUMBER() over (order by(select null)) from n5)
insert into date_mani(dates)
select DATEADD(day,id-1,'2022-01-01')
from ids
---
create nonclustered index ix_dates on date_mani(dates)

now this table has some dates,

if we declare a variable like

declare @year datetime
set @year = '2005'
select dates 
from date_mani
where DATEPART(year,dates) =@year

now here we are trying to locate the years that have 2001 in it

if we do that like this SQL serves has to go and process each year and compare it to the variable

before knowing what it is looking at but if we write it like this


declare @year datetime
set @year = '2025'
select dates
from date_mani
where  dates>=@year 
and  dates<dateadd(year,1,@year)

now here we get a seek

why did we get a seek? SQL server knows the value we want before the execution, so it does not have to figure out what it is before going in, it just goes to the index and picks it up for us

and the range here means get me the year if the value matches year = 2025

but it is less than the next year

which is exactly what we want =2025

another example with dates

here:

select dates
from date_mani
where dateadd(day,7,dates)>getdate()

now here it had to calculate each date and compare it to today

now if we do the following:

select dates 
from date_mani 
where dates>dateadd(day,-7,getdate())

now we are seeking

Prefix search

now, let’s create this table:

create table like_ma(
id int identity(1,1) primary key clustered not null,
col1 varchar(200) 
);
with n1(c) as (select 0 union all select 0),
n2(c) as( select 0 from n1 as t2 cross join n1 as t1 ),
n3(c) as (select 0 from n2 as t2 cross join n2 as t1),
n4(c) as ( select 0 from n3 as t1 cross join n3 as t2),
n5(c) as ( select 0 from n4 as t1 cross join n4 as t2),
ids(id) as ( select ROW_NUMBER() over( order by (select null) )from n5)
insert into like_ma(col1)
select 'value '+ cast(id as varchar(10))
from ids

After that let’s do the following


create nonclustered index ix_col1_like_ma on like_ma(col1) 

now let's do the following

select col1 
from like_ma
where LEFT(col1,7) = 'value 1'

now here it scans since it has to compare each row

but if we do it like this:

select col1 
from like_ma
where col1 like'value 1%'

We get our seek since it knows the rows, it just needs to get and compare them.

Substring search

now here:

select col1 
from like_ma 
wherer col1 like'%value 1%'

Now how could we look up substrings with index seeks?

trigrams

or FTS, full-text search

FTS would need its own series, it has its own engine

but as an intro if we wanted to look up language phrases

it needs a column that is not nullable and a primary key which we have as id

first , we need to check if the full-textsearch engine is installed

we can do that by:

select serverproperty('IsFullTextInstalled');

in our case it was

then we need to create a catalog for it which is the metadata of the full-text index file

now if we don’t have one SQL server, we would just use the default, but would specify it in this case

create fulltext catalog ft_catalog as default;

then we have to create a full-text index on our column and before that, we have to get the primary key:


select*
from sys.indexes 
where object_id = OBJECT_ID('dbo.like_ma')

ours was

PK__like_ma__3213E83FDFC150C0

nNowlet’s create the full-text index:

create fulltext index  on like_ma(col1) 
key index PK__like_ma__3213E83FDFC150C0
 on ft_catalog

Now we select like this

 select col1
from dbo.like_ma 
where contains(col1,'"value 65521"')

Now the execution plan was like this:

more on that in another series,

and the trigrams will be covered in another blog.

Dynamic seeks

Paul White

now if we use dynamic SQL to find the value 1 like:

select col1 
from like_ma
where col1 like N'Value%'

we would get a type conversion

now we could get around this but this is too complex for here, instead, we are going to create a new table with nvarchar in it:

create table like_ma_n(
id int identity(1,1) primary key clustered not null,
col1 nvarchar(200) 
);
with n1(c) as (select 0 union all select 0),
n2(c) as( select 0 from n1 as t2 cross join n1 as t1 ),
n3(c) as (select 0 from n2 as t2 cross join n2 as t1),
n4(c) as ( select 0 from n3 as t1 cross join n3 as t2),
n5(c) as ( select 0 from n4 as t1 cross join n4 as t2),
ids(id) as ( select ROW_NUMBER() over( order by (select null) )from n5)
insert into like_ma_n(col1)
select N'value '+ cast(id as nvarchar(10))
from ids
create nonclustered  index ix_whatever on dbo.like_ma_n(col1)

now here we still have implicit conversions since id is an int

but we don’t care about that right now, let’s just populate the table:

select col1 
from dbo.like_ma_n
where col1 like N'value 1%'

and we would get our index seek

now what if we use variables:

declare @value nvarchar(100)
set @value = N'value 1%'
select col1 
from dbo.like_ma_n
where col1 like @value

We would get the following:

So we have our seek, but we also have 3 other physical operators

So what are these?

we could put some traces( for more details check out our blog on logical tree generation) to see what the query optimizer did:

declare @value nvarchar(100)
set @value = N'value 1%'
select col1 
from dbo.like_ma_n
where col1 like @value
OPTION (
  USE PLAN N'',
  QUERYTRACEON 3604,
  QUERYTRACEON 8607,
  QUERYTRACEON 7352,querytraceon 8605,QUERYTRACEON 8606,QUERYTRACEON 8607,
QUERYTRACEON 8608,QUERYTRACEON 3604,
QUERYTRACEON 8619,QUERYTRACEON 8620,
QUERYTRACEON 8621,QUERYTRACEON 8615,
QUERYTRACEON 8675)

now if we did not use the plan, SQL server will consider anything that has traces as non-trivial

so we have to use the use plan hint to see our plan otherwise we would get a seek

and we have to use

<ColumnReference Column="@value" ParameterDataType="nvarchar(100)" ParameterRuntimeValue="N&apos;value 1%&apos;" />

in that specific line to get escape single quotes in single quotes, you should do the same

in use plan, you just copy the plan XML and put it in there

we got the following output:


*** Converted Tree: ***
    LogOp_Project QCOL: [TestDatabase].[dbo].[like_ma_n].col1

        LogOp_Select

            LogOp_Get TBL: dbo.like_ma_n dbo.like_ma_n TableID=1746105261 TableReferenceID=0 IsRow: COL: IsBaseRow1000 

            ScaOp_Intrinsic like

                ScaOp_Identifier QCOL: [TestDatabase].[dbo].[like_ma_n].col1

                ScaOp_Identifier COL: @value 

                ScaOp_Const TI(nvarchar collate 872468488,Var,Trim,ML=2) XVAR(nvarchar,Not Owned,Value=EMPTY)

                ScaOp_Intrinsic LikeRangeStart

                    ScaOp_Identifier COL: @value 

                    ScaOp_Const TI(nvarchar collate 872468488,Var,Trim,ML=2) XVAR(nvarchar,Not Owned,Value=EMPTY)

                ScaOp_Intrinsic LikeRangeEnd

                    ScaOp_Identifier COL: @value 

                    ScaOp_Const TI(nvarchar collate 872468488,Var,Trim,ML=2) XVAR(nvarchar,Not Owned,Value=EMPTY)

                ScaOp_Intrinsic LikeRangeInfo

                    ScaOp_Identifier COL: @value 

                    ScaOp_Const TI(nvarchar collate 872468488,Var,Trim,ML=2) XVAR(nvarchar,Not Owned,Value=EMPTY)

        AncOp_PrjList 

*******************

So we have:

  • tTheprojection logical operator that shows us what would we get, it defines the objects we need it has two child operators AncOp_PrjList is for internal patching in here:
    • Log_Op select: filters the data that qualifies, it is like the where clause, it has two child operators:
      • log_op get: this gets the rows that we need from the table, meaning the ones that get filtered by the other child operators:
      • ScaOp_Intrinsic like: it is the like function, and it has 6 child operators:
        • ScaOp_Identifier QCOL the first one defines the first side of the like function, argument, here it is our table’s col1
        • ScaOp_Identifier QCOL the second side of our like argument here it is our @value variable
        • ScaOp_Const TI: could be for internal patching, or for the ESCAPE argument if that stuff existed
        • ScaOp_Intrinsic LikeRangeStart: this is the start of our dynamic seek, but wait a minute, we did not have a range, so how did this operator get generated, well basically since SQL server can not seek in the index dynamically in the seek operator itself it had to create 3 other operators we talked about, now first it loads the @value in the constant scan then it creates 3 ranges in the compute scalar one of them LikeRangeStart, and the other 2 operators as expressions, then it puts these ranges in the seek predicates of the index seek so it could scan the range in at the actual index using the nested loop join now here it starts defining the range by 2 child operators:
          • ScaOp_Identifier COL: @value the first one identifies, sets where should we get the start from, here it is our variable
          • the other one could be for the ESCAPE argument if we used it but since we did not it is not empty
        • ScaOp_Intrinsic LikeRangeEnd: same here it is trying to define the end of our like range
        • ScaOp_Intrinsic LikeRangeInfo: now this is used for internals for the query optimizer it relates @value to the engine

Input tree:

** Input Tree: ***
        LogOp_Project QCOL: [TestDatabase].[dbo].[like_ma_n].col1

            LogOp_Select

                LogOp_Get TBL: dbo.like_ma_n dbo.like_ma_n TableID=1746105261 TableReferenceID=0 IsRow: COL: IsBaseRow1000 

                ScaOp_Intrinsic like

                    ScaOp_Identifier QCOL: [TestDatabase].[dbo].[like_ma_n].col1

                    ScaOp_Identifier COL: @value 

                    ScaOp_Const TI(nvarchar collate 872468488,Var,Trim,ML=2) XVAR(nvarchar,Not Owned,Value=EMPTY)

                    ScaOp_Identifier COL: ConstExpr1002 

                    ScaOp_Identifier COL: ConstExpr1003 

                    ScaOp_Identifier COL: ConstExpr1004 

            AncOp_PrjList 

*******************

now here it just named each of the ranges as ConstExpression columns instead of referencing the variable in each

        PhyOp_Filter

            PhyOp_Range TBL: dbo.like_ma_n(2) ASC  Bmk ( QCOL: [TestDatabase].[dbo].[like_ma_n].id) IsRow: COL: IsBaseRow1000 

                ScaOp_Intrinsic like

                    ScaOp_Identifier QCOL: [TestDatabase].[dbo].[like_ma_n].col1

                    ScaOp_Identifier COL: @value 

                    ScaOp_Const TI(nvarchar collate 872468488,Var,Trim,ML=2) XVAR(nvarchar,Not Owned,Value=EMPTY)

                    ScaOp_Identifier COL: ConstExpr1002 

                    ScaOp_Identifier COL: ConstExpr1003 

                    ScaOp_Identifier COL: ConstExpr1004 

            ScaOp_Intrinsic like[** 2 **]

now this is the output plan, as you can see we have the range operator, which would be applied as and index seek

then we have our like operator

with the whole previous thing, the same all the way

now after that:

Inner Join Nested Loops (0) 

                    [QCOL: [TestDatabase].[dbo].[like_ma_n].col1 TI(nvarchar collate 872468488,Null,Var,Trim,ML=400)] 0000013739EC2190  P 

  Compute Scalar Compute Scalar (1) 

                      [CALC:COL: Expr1005  TI(nvarchar collate 872468488,Null,Var,Trim,ML=200)] 000001373B1E1A30  N 

                      [CALC:COL: Expr1006  TI(nvarchar collate 872468488,Null,Var,Trim,ML=200)] 000001373B1E1B20  N 

                      [CALC:COL: Expr1007  TI(int,Null,ML=4)] 000001373B1E1C20  N 

    Constant Scan Constant Scan (2) 

  Index Seek Index Seek (3)  CP 

                      [QCOL: [TestDatabase].[dbo].[like_ma_n].col1 TI(nvarchar collate 872468488,Null,Var,Trim,ML=400)] 0000013739EC2190  P 

now 0 and them here references the memo

we have the same as the physical plan

we see out 3 ranges in the compute scalar

we have our nested loop that joins with the ranges so it can seek it

2 in here

Group 2: 
   0 ScaOp_Const   Cost(RowGoal 0,ReW 0,ReB 0,Dist 0,Total 0)= 1 (Distance = 0)

so here it references the constant value of our @value

now if we look at the graphical plan:

So in the properties of the constant scan, we have one row = @value passed

Then in compute scalar, we can see our 3 expression columns:

now here it goes there graps the expressions we created

then it creates the computed ranges for them

and call each

after that through the nested loop, it makes sure the the seek fits these ranges, predicates, columns

this was in the seek predicates now as you can see it references our computed ranges

now if we take another look

Seek Keys[1]: Start: [TestDatabase].[dbo].[like_ma_n].col1 > Scalar Operator([Expr1005]),
 End: [TestDatabase].[dbo].[like_ma_n].col1 < Scalar Operator([Expr1006])

the start is bigger than

now we know that expression 1005 is the start

and 1006 is the end

now we have the

now it compares each value it gets from the range scan to the like[@value]

and by that, it gets the row and everything that is like it

Multiple dynamic seeks

what if we did the following:

declare @value1 nvarchar(100)
set @value1 = N'value 1%'
declare @value2 nvarchar(100)
set @value2 = N'value 2%'
declare @value3 nvarchar(100)
set @value3 = N'value 3%'
select col1 
from dbo.like_ma_n
where col1 like @value1
   or col1 like @value2
   or col1 like @value3

we would get the following

now the first constant scans and the compute scalars are the same as before

concatenation does what it is supposed to do combines of all the results and ranges

then the compute scalar defines the ranges again since the concat has to pass its feed to get computed

then the sort classifies them from the first to the last

so they are neatly organized neatly next to each other

the merge interval does exactly what it sounds like merges the intervals, so if there are any overlapping intervals, they should be only returned once, if not then the ranges passed as is

then they are passed to the nested loop joins to see if the range fits the predicate, as seek predicates and we get out output

now if we want to see how the optimizer did this, again we have to use the use plan hint since the query itself is trivial, but once optimized, the same would be done, but all would be included in the seek predicates, not as separate operators

so:

declare @value1 nvarchar(100)
set @value1 = N'value 1%'
declare @value2 nvarchar(100)
set @value2 = N'value 2%'
declare @value3 nvarchar(100)
set @value3 = N'value 3%'
select col1 
from dbo.like_ma_n
where col1 like @value1
   or col1 like @value2
   or col1 like @value3
OPTION (
  USE PLAN N' ' ,QUERYTRACEON 3604,
  QUERYTRACEON 8607,
  QUERYTRACEON 7352,querytraceon 8605,QUERYTRACEON 8606,QUERYTRACEON 8607,
QUERYTRACEON 8608,QUERYTRACEON 3604,
QUERYTRACEON 8619,QUERYTRACEON 8620,
QUERYTRACEON 8621,QUERYTRACEON 8615,
QUERYTRACEON 8675)

dDon’tforget to fix the single quotes like:

<ColumnReference Column="@value3" ParameterDataType="nvarchar(100)" ParameterRuntimeValue="N&apos;value 3%&apos;" />
<ColumnReference Column="@value2" ParameterDataType="nvarchar(100)" ParameterRuntimeValue="N&apos;value 2%&apos;" />
<ColumnReference Column="@value1" ParameterDataType="nvarchar(100)" ParameterRuntimeValue="N&apos;value 1%&apos;" />

Now we don’t see much difference in the trees except in physical operators after optimization:


End of post optimization rewrite, time: 0.042 net: 0.001 total: 0.458 net: 0.014

Inner Join Nested Loops (0) 

                    [QCOL: [TestDatabase].[dbo].[like_ma_n].col1 TI(nvarchar collate 872468488,Null,Var,Trim,ML=400)] 000001373B86F310  P 

  Merge Interval Merge Interval (1) 

                      [LOCAL:COL: Expr1026  TI(nvarchar collate 872468488,Null,Var,Trim,ML=400)] 000001373B86EB40  N 

                      [LOCAL:COL: Expr1027  TI(nvarchar collate 872468488,Null,Var,Trim,ML=400)] 000001373B86ECF0  N 

                      [LOCAL:COL: Expr1028  TI(int,ML=4)] 000001373B86EEA0  N 

    TopN Sort Sort (2) 

                        [COL: Expr1026  TI(nvarchar collate 872468488,Null,Var,Trim,ML=400)] 000001373B86BFA0  N 

                        [COL: Expr1027  TI(nvarchar collate 872468488,Null,Var,Trim,ML=400)] 000001373B86E1E0  N 

                        [COL: Expr1028  TI(int,Null,ML=4)] 000001373B86E2E0  N 

                        [COL: Expr1029  TI(bit,Null,ML=1)] 000001373B86BDB0  N 

                        [COL: Expr1030  TI(int,Null,ML=4)] 000001373B86BEA0  N 

                        [COL: Expr1031  TI(int,Null,ML=4)] 000001373B86E0F0  N 

      Compute Scalar Compute Scalar (3) 

                          [UNION:COL: Expr1026  TI(nvarchar collate 872468488,Null,Var,Trim,ML=400)] 000001373B86B480  N 

                          [UNION:COL: Expr1027  TI(nvarchar collate 872468488,Null,Var,Trim,ML=400)] 000001373B86B560  N 

                          [UNION:COL: Expr1028  TI(int,ML=4)] 000001373B86B630  N 

                          [CALC:COL: Expr1029  TI(bit,Null,ML=1)] 000001373B86B710  N 

                          [CALC:COL: Expr1030  TI(int,Null,ML=4)] 000001373B86B800  N 

                          [CALC:COL: Expr1031  TI(int,Null,ML=4)] 000001373B86B900  N 

        Concatenation Concatenation (4) 

                            [UNION:COL: Expr1026  TI(nvarchar collate 872468488,Null,Var,Trim,ML=400)] 000001373B86B480  N 

                            [UNION:COL: Expr1027  TI(nvarchar collate 872468488,Null,Var,Trim,ML=400)] 000001373B86B560  N 

                            [UNION:COL: Expr1028  TI(int,ML=4)] 000001373B86B630  N 

          Compute Scalar Compute Scalar (5) 

                              [CALC:COL: Expr1020  TI(nvarchar collate 872468488,Null,Var,Trim,ML=400)] 000001373B867DB0  N 

                              [CALC:COL: Expr1021  TI(nvarchar collate 872468488,Null,Var,Trim,ML=400)] 000001373B867EA0  N 

                              [CALC:COL: Expr1013  TI(int,Null,ML=4)] 000001373B8679D0  N 

            Compute Scalar Compute Scalar (6) 

                                [CALC:COL: Expr1011  TI(nvarchar collate 872468488,Null,Var,Trim,ML=200)] 000001373B8677E0  N 

                                [CALC:COL: Expr1012  TI(nvarchar collate 872468488,Null,Var,Trim,ML=200)] 000001373B8678E0  N 

                                [CALC:COL: Expr1013  TI(int,Null,ML=4)] 000001373B8679D0  N 

              Constant Scan Constant Scan (7) 

          Compute Scalar Compute Scalar (8) 

                              [CALC:COL: Expr1022  TI(nvarchar collate 872468488,Null,Var,Trim,ML=400)] 000001373B86A790  N 

                              [CALC:COL: Expr1023  TI(nvarchar collate 872468488,Null,Var,Trim,ML=400)] 000001373B86A890  N 

                              [CALC:COL: Expr1016  TI(int,Null,ML=4)] 000001373B86A3C0  N 

            Compute Scalar Compute Scalar (9) 

                                [CALC:COL: Expr1014  TI(nvarchar collate 872468488,Null,Var,Trim,ML=200)] 000001373B86A1D0  N 

                                [CALC:COL: Expr1015  TI(nvarchar collate 872468488,Null,Var,Trim,ML=200)] 000001373B86A2C0  N 

                                [CALC:COL: Expr1016  TI(int,Null,ML=4)] 000001373B86A3C0  N 

              Constant Scan Constant Scan (10) 

          Compute Scalar Compute Scalar (11) 

                              [CALC:COL: Expr1024  TI(nvarchar collate 872468488,Null,Var,Trim,ML=400)] 000001373B86B0F0  N 

                              [CALC:COL: Expr1025  TI(nvarchar collate 872468488,Null,Var,Trim,ML=400)] 000001373B86B1E0  N 

                              [CALC:COL: Expr1019  TI(int,Null,ML=4)] 000001373B86AD20  N 

            Compute Scalar Compute Scalar (12) 

                                [CALC:COL: Expr1017  TI(nvarchar collate 872468488,Null,Var,Trim,ML=200)] 000001373B86AB30  N 

                                [CALC:COL: Expr1018  TI(nvarchar collate 872468488,Null,Var,Trim,ML=200)] 000001373B86AC20  N 

                                [CALC:COL: Expr1019  TI(int,Null,ML=4)] 000001373B86AD20  N 

              Constant Scan Constant Scan (13) 

  Index Seek Index Seek (14)  CP 

                      [QCOL: [TestDatabase].[dbo].[like_ma_n].col1 TI(nvarchar collate 872468488,Null,Var,Trim,ML=400)] 000001373B86F310  P 

End of query plan compilation, time: 0.063 net: 0.002 total: 0.521 net: 0.017

now here you can see that there is not much difference except that it created new operators to merge intervals using

concatenation

compute scalar

sort

then merge

and you can see each of the expressions here

and now we are done here, see you in the next blog

Leave a Reply

Your email address will not be published. Required fields are marked *