DWH basics – partitioning

Posted: November 3, 2012 in MSSQL
Tags:
Task: to show how to horizontal partition a database table and how to apply different data compression to this partitions


SQL Server supports only one type of horizontal partitioning, the range partitions. It is the partitioning strategy in which data is partitioned based on the range that the value of a particular field falls in.

Figure 1: UML – Use Case

Figure 1: UML – Use Case

The process of horizontal partitioning of a database table and data compression consists of the following steps:
1. Create a new test database with two different filegroups (NOTE: the path C:\Temp\SQL Server\Partitioning\Primary and Secondary must already exist!)


IF EXISTS(
SELECT name
FROM sys.databases
WHERE name = N'Test_Partitioning')
DROP DATABASE Test_Partitioning
GO
CREATE DATABASE Test_Partitioning
ON PRIMARY
(NAME = 'TestPartitioning_Part2010',
FILENAME =
'C:\Temp\SQL Server\Partitioning\Primary\TestPartitioning_Part2010.mdf',
SIZE = 5, -- ISSUE: The CREATE DATABASE statement failed. The primary file
-- must be at least 5 MB to accommodate a copy of the model database.
MAXSIZE = 100,
FILEGROWTH = 1),
FILEGROUP TestPartitioning_Part2011
(NAME = 'TestPartitioning_Part2011',
FILENAME =
'C:\Temp\SQL Server\Partitioning\Secondary\TestPartitioning_Part2011.ndf',
SIZE = 2,
MAXSIZE = 100,
FILEGROWTH = 1),
FILEGROUP TestPartitioning_Part2012
(NAME = 'TestPartitioning_Part2012',
FILENAME =
'C:\Temp\SQL Server\Partitioning\Secondary\TestPartitioning_Part2012.ndf',
SIZE = 2,
MAXSIZE = 100,
FILEGROWTH = 1)
GO

2. Create the partition range function


USE Test_Partitioning
GO
CREATE PARTITION FUNCTION TestPartitioning_PartitionFunction(DATE)
AS RANGE LEFT FOR
VALUES('2010-12-31', '2012-01-01')
-- NOTE: interval: (- infinity, 2010-12-31],
--                 [2011-01-01, 2011-12-31] and
--                 [2012-01-01, infinity +)
GO

3. Create the partition scheme and attach the partition scheme to filegroups


USE Test_Partitioning
GO
CREATE PARTITION SCHEME TestPartitioning_PartitionScheme
AS PARTITION TestPartitioning_PartitionFunction
TO([PRIMARY], TestPartitioning_Part2011, TestPartitioning_Part2012)
GO

4. Create a table with the partition key and the partition scheme


USE Test_Partitioning
GO
CREATE TABLE dbo.Test_PartitionedTable
(ID INT NOT NULL,
[Date] DATE)
ON TestPartitioning_PartitionScheme([Date])
GO

5. Create the index on the partitioned table (optional and recommended)


USE Test_Partitioning
GO
CREATE UNIQUE CLUSTERED INDEX IX_TestPartition_Table
ON dbo.Test_PartitionedTable([Date])
ON TestPartitioning_PartitionScheme([Date])
GO

6. Insert data into the partitioned table


USE Test_Partitioning
GO
INSERT INTO dbo.Test_PartitionedTable(ID, [Date])
VALUES
(1, DATEADD(YEAR, -3, GETDATE())), -- inserted in the partition 2009
(2, DATEADD(YEAR, -2, GETDATE())), -- inserted in the partition 2010
(3, DATEADD(YEAR, -1, GETDATE())), -- inserted in the partition 2011
(4,                   GETDATE())   -- inserted in the partition 2012
GO

7. Test data from the dbo.Test_PartitionedTable


USE Test_Partitioning
GO
SELECT *
FROM dbo.Test_PartitionedTable
/*
SSMS RESULTS:
ID  Date
1   2009-10-19
2   2010-10-19
3   2011-10-19
4   2012-10-19
*/
GO

8. Verify rows inserted in partitions


USE Test_Partitioning
GO
SELECT *
FROM sys.partitions
WHERE OBJECT_NAME(OBJECT_ID) = 'Test_PartitionedTable'
ORDER BY partition_number
/*
SSMS RESULTS:
partition_id        object_id   index_id   partition_number   hobt_id             rows   filestream_filegroup_id   data_compression   data_compression_desc 
72057594039238656   245575913	1	   1	              72057594039238656   2	 0	                   0	              NONE
72057594039304192   245575913	1	   2	              72057594039304192	  1	 0	                   0	              NONE
72057594039369728   245575913	1	   3	              72057594039369728	  1	 0	                   0	              NONE
*/
GO

9. SPLIT – add the new partition P4 for the next year 2013


USE Test_Partitioning
GO
ALTER DATABASE Test_Partitioning
ADD FILEGROUP TestPartitioning_Part2013
GO
ALTER DATABASE Test_Partitioning
ADD FILE
(NAME = 'TestPartitioning_Part2013',
FILENAME =
'C:\Temp\SQL Server\Partitioning\Secondary\TestPartitioning_Part2013.ndf',
SIZE = 2,
MAXSIZE = 100,
FILEGROWTH = 1)
TO FILEGROUP TestPartitioning_Part2013
GO
ALTER PARTITION SCHEME TestPartitioning_PartitionScheme
NEXT USED TestPartitioning_Part2013
GO
ALTER PARTITION FUNCTION TestPartitioning_PartitionFunction()
SPLIT RANGE('2013-01-01')
-- NOTE: interval: (- infinity, 2010-12-31],
--                 [2011-01-01, 2011-12-31],
--                 [2012-01-01, 2012-12-31] and
--                 [2013-01-01, infinity +)
GO

10. SWITCH IN – move data from the staging table to the empty newly added partition P4


USE Test_Partitioning
GO

--INSERT INTO dbo.Test_PartitionedTable(ID, [Date])
--VALUES
--(5, DATEADD(YEAR,  1, GETDATE())), -- inserted in the partition 2013
--(6, DATEADD(YEAR,  2, GETDATE()))  -- inserted in the partition 2014
--GO

IF EXISTS(
SELECT name
FROM sys.databases
WHERE name = N'Test_Partitioning_Staging')
DROP DATABASE Test_Partitioning_Staging
GO
CREATE DATABASE Test_Partitioning_Staging
ON PRIMARY
(NAME = 'Test_Partitioning_Staging',
FILENAME =
'C:\Temp\SQL Server\Partitioning\Primary\Test_Partitioning_Staging.mdf',
SIZE = 5, -- ISSUE: The CREATE DATABASE statement failed. The primary file
-- must be at least 5 MB to accommodate a copy of the model database.
MAXSIZE = 100,
FILEGROWTH = 1)
GO


USE Test_Partitioning_Staging
GO

CREATE TABLE dbo.Test_NonPartitionedStagingTable
(ID INT NOT NULL,
[Date] DATE)
GO
CREATE UNIQUE CLUSTERED INDEX IX_Test_NonPartitionedStagingTable
ON dbo.Test_NonPartitionedStagingTable([Date])
GO
INSERT INTO dbo.Test_NonPartitionedStagingTable(ID, [Date])
VALUES
(5, DATEADD(YEAR,  1, GETDATE())), -- inserted in the partition 2013
(6, DATEADD(YEAR,  2, GETDATE()))  -- inserted in the partition 2014
/*
SSMS RESULTS:
ID  Date
5   2013-10-31
6   2014-10-31
*/
GO


USE Test_Partitioning
GO

ALTER DATABASE Test_Partitioning
MODIFY FILEGROUP TestPartitioning_Part2012 DEFAULT
GO

CREATE TABLE dbo.Test_NonPartitionedStagingTemp
(ID INT NOT NULL,
[Date] DATE)
GO
CREATE UNIQUE CLUSTERED INDEX IX_Test_NonPartitionedStagingTemp
ON dbo.Test_NonPartitionedStagingTemp([Date])
GO
ALTER TABLE dbo.Test_NonPartitionedStagingTemp
ADD CONSTRAINT PartitionFunction CHECK ([Date] > '2013-01-01'
AND [Date] IS NOT NULL)
GO
INSERT INTO dbo.Test_NonPartitionedStagingTemp(ID, [Date])
SELECT ID, [Date]
FROM Test_Partitioning_Staging.dbo.Test_NonPartitionedStagingTable
GO
SELECT * FROM dbo.Test_NonPartitionedStagingTemp
GO
/*
SSMS RESULTS:
ID  Date
5   2013-10-31
6   2014-10-31
*/

ALTER TABLE dbo.Test_NonPartitionedStagingTemp
SWITCH TO dbo.Test_PartitionedTable PARTITION 4
GO
DROP TABLE dbo.Test_NonPartitionedStagingTemp
GO

ALTER DATABASE Test_Partitioning
MODIFY FILEGROUP [PRIMARY] DEFAULT
GO

SELECT *
FROM dbo.Test_PartitionedTable
/*
SSMS RESULTS:
ID  Date
1   2009-10-19
2   2010-10-19
3   2011-10-19
4   2012-10-19
5   2013-10-19
6   2014-10-19
*/
GO
SELECT *
FROM sys.partitions
WHERE OBJECT_NAME(OBJECT_ID) = 'Test_PartitionedTable'
ORDER BY partition_number
/*
SSMS RESULTS:
partition_id        object_id   index_id   partition_number   hobt_id             rows   filestream_filegroup_id   data_compression   data_compression_desc 
72057594039238656   245575913	1	   1	              72057594039238656   2	 0	                   0	              NONE
72057594039304192   245575913	1	   2	              72057594039304192	  1	 0	                   0	              NONE
72057594039435264   245575913	1	   3	              72057594039435264	  1	 0	                   0                  NONE
72057594039369728   245575913	1	   4	              72057594039369728	  2	 0	                   0	              NONE
*/
GO

11. SWITCH OUT – move data from the first partition P1 to the archive table dbo.Test_NonPartitionedArchiveTable


USE Test_Partitioning
GO

CREATE TABLE dbo.Test_NonPartitionedArchiveTable
(ID INT NOT NULL,
[Date] DATE)
ON [PRIMARY]
GO
CREATE UNIQUE CLUSTERED INDEX IX_NonPartitioned_Archive
ON dbo.Test_NonPartitionedArchiveTable([Date])
GO

ALTER TABLE dbo.Test_PartitionedTable SWITCH PARTITION 1
TO dbo.Test_NonPartitionedArchiveTable
GO

SELECT *
FROM dbo.Test_NonPartitionedArchiveTable
/*
SSMS RESULTS:
ID  Date
1   2009-10-19
2   2010-10-19
*/
GO
SELECT *
FROM sys.partitions
WHERE OBJECT_NAME(OBJECT_ID) = 'Test_PartitionedTable'
ORDER BY partition_number
/*
SSMS RESULTS:
partition_id        object_id   index_id   partition_number   hobt_id             rows   filestream_filegroup_id   data_compression   data_compression_desc 
72057594039238656   245575913	1	   1	              72057594039238656   0	 0	                   0	              NONE
72057594039304192   245575913	1	   2	              72057594039304192	  1	 0	                   0	              NONE
72057594039435264   245575913	1	   3	              72057594039435264	  1	 0	                   0                  NONE
72057594039369728   245575913	1	   4	              72057594039369728	  2	 0	                   0	              NONE
*/
GO

12. MERGE – based on ‘2012-01-01’, i.e. P1 = PRIMARY FG (2010) + SECONDARY FG (2011)


USE Test_Partitioning
GO

ALTER PARTITION FUNCTION TestPartitioning_PartitionFunction()
MERGE RANGE('2012-01-01')
-- NOTE: interval: (- infinity, 2010-12-31]
--                 [2011-01-01, 2011-12-31],
--                 [2012-01-01, 2012-12-31] and
--                 [2013-01-01, infinity +)
GO
SELECT *
FROM dbo.Test_PartitionedTable
/*
SSMS RESULTS:
ID  Date
3   2011-10-19
4   2012-10-19
5   2013-10-19
6   2014-10-19
*/
GO
SELECT *
FROM sys.partitions
WHERE OBJECT_NAME(OBJECT_ID) = 'Test_PartitionedTable'
ORDER BY partition_number
/*
SSMS RESULTS:
partition_id        object_id   index_id   partition_number   hobt_id             rows   filestream_filegroup_id   data_compression   data_compression_desc 
72057594039238656   245575913	1	   1	              72057594039238656   0	 0	                   0	              NONE
72057594039304192   245575913	1	   2	              72057594039304192	  2	 0	                   0	              NONE
72057594039369728   245575913	1	   3	              72057594039369728	  2	 0	                   0	              NONE
*/
GO

INSERT INTO dbo.Test_PartitionedTable(ID, [Date])
VALUES
(1, DATEADD(YEAR, -3, GETDATE())), -- inserted in the partition 2009
(2, DATEADD(YEAR, -2, GETDATE()))  -- inserted in the partition 2010
GO
SELECT *
FROM dbo.Test_PartitionedTable
/*
SSMS RESULTS:
ID  Date
1   2009-10-19
2   2010-10-19
3   2011-10-19
4   2012-10-19
5   2013-10-19
6   2014-10-19
*/
GO
SELECT *
FROM sys.partitions
WHERE OBJECT_NAME(OBJECT_ID) = 'Test_PartitionedTable'
ORDER BY partition_number
/*
SSMS RESULTS:
partition_id        object_id   index_id   partition_number   hobt_id             rows   filestream_filegroup_id   data_compression   data_compression_desc 
72057594039238656   245575913	1	   1	              72057594039238656   2	 0	                   0	              NONE
72057594039304192   245575913	1	   2	              72057594039304192	  2	 0	                   0	              NONE
72057594039369728   245575913	1	   3	              72057594039369728	  2	 0	                   0	              NONE
*/
GO

13. Apply compression to your partitioned table


USE Test_Partitioning
GO
ALTER TABLE Test_PartitionedTable
REBUILD PARTITION = All
WITH 
(
DATA_COMPRESSION = PAGE ON Partitions(1), -- NOTE: (1 TO 3)
DATA_COMPRESSION = ROW  ON Partitions(2) ,
DATA_COMPRESSION = NONE ON Partitions(3)
);
GO

14. Apply compression to your partitioned index


USE Test_Partitioning
GO
ALTER INDEX IX_TestPartition_Table
ON dbo.Test_PartitionedTable
REBUILD PARTITION  = All
WITH 
(
DATA_COMPRESSION = PAGE ON Partitions(1),
DATA_COMPRESSION = ROW  ON Partitions(2),
DATA_COMPRESSION = NONE ON Partitions(3)
);
GO

15. Apply compression to your unpartitioned index


--USE Test_Partitioning
--GO
--ALTER INDEX YourUnpartitionedIndex
--ON dbo.Test_PartitionedTable
--REBUILD WITH (DATA_COMPRESSION = ROW);
--GO

16. Testing of application three types of data compression


USE Test_Partitioning
GO
SELECT *
FROM sys.partitions
WHERE OBJECT_NAME(OBJECT_ID) = 'Test_PartitionedTable'
ORDER BY partition_number
/*
SSMS RESULTS:
partition_id        object_id   index_id   partition_number   hobt_id             rows   filestream_filegroup_id   data_compression   data_compression_desc 
72057594039238656   245575913	1	   1	              72057594039238656   2	 0	                   2	              PAGE
72057594039304192   245575913	1	   2	              72057594039304192	  2	 0	                   1	              ROW
72057594039369728   245575913	1	   3	              72057594039369728	  2	 0	                   0	              NONE
*/
GO


Recommendations:
  • filegroups can be setup as read-only. It will increase the performance
  • relational and cube partitions should be based on the same column
  • SQL Server 7.0 introduced partitioning through partitioned views
  • partitioning: when the number of rows is higher than 1M
  • roughly 25GB per partition is well manageable (100GB would be poorly manageable)
  • operations: SPLIT, MERGE and SWITCH are meta data operations only and do not involve any movement of data
  • usage: to migrate older data from more expensive disk to less expensive disk
Quotes from references:
  • “Though it is possible to create all partitions on PRIMARY but it would be best if these different partitions are stored in a separate file groups. This gives some performance improvement even in the case of single core computers. It would be best if these file groups are on different discs on a multi core processing machine.”
  • “A partitioned table may have a partitioned index. Partition aligned index views may also be created for this table.
    There may be question in your mind if it is possible to partition your table using multiple columns. The answer may be YES or NO. Why? No, because there is no such direct support for this in SQL Server. Yes, because you can still do that by using persisted computed column based on any number of columns you want.”
  • “For partitioning your existing table just drop the clustered index on your table and recreate it on the required partition scheme.”
  • “Though SQL Server does not directly support List Partitioning, you can create list partitions by tricking the partition function to specify the values with the LEFT clause. After that, put a CHECK constraint on the table, so that no other values are allowed to be inserted in the table specifying Partition Key column any value other than the ‘list’ of values.”
  • “It must be remembered that indexes can be partitioned using a different partition key than the table. The index can also have different numbers of partitions than the table. We cannot, however, partition the clustered index differently from the table. To partition an index, ON clause is used, specifying the partition scheme along with the column when creating the index:
    If your table and index use the same Partition function then they are called Aligned. If they go further and also use the same partition scheme as well, then they are called Storage Aligned (note this in the figure below). If you use the same partition function for partitioning index as used by the table, then generally performance is improved.”
Additional references:
Advertisements

DWH basics – columnstore index for fast DW

Posted: October 28, 2012 in MSSQL
Tags:
Task: to make performance tests to show a performance impact of the columnstore index (SQL Server 2012)


Recommendations:
  • to use on very large tables only (77 628 600 rows in my example; fact tables are good candidates)
  • when the columnstore index is created, the table becomes read-only (despite of this fact, update/insert/delete operations are possible, e.g. by means of disabling the columnstore index)

The process of testing a performance impact of the columnstore index consists of the following steps:
1. Create the testing table


USE [AdventureWorksDW2012]
GO

CREATE TABLE [dbo].[MyFactProductInventory](
  [ProductKey] [int] NOT NULL,
  [DateKey] [int] NOT NULL,
  [MovementDate] [date] NOT NULL,
  [UnitCost] [money] NOT NULL,
  [UnitsIn] [int] NOT NULL,
  [UnitsOut] [int] NOT NULL,
  [UnitsBalance] [int] NOT NULL,
) ON [PRIMARY]
GO

2. Create the clustered regular index


CREATE CLUSTERED INDEX [IX_MyFactProductInventory_Clustered]
ON [dbo].[MyFactProductInventory]([ProductKey])
GO

3. Create the sample data table (WARNING: this query may run up to 2-10 minutes based on your systems resources)


INSERT INTO [dbo].[MyFactProductInventory]
SELECT [Fact].* FROM [dbo].[FactProductInventory] AS [Fact]
GO 100

4. Performance tests (NOTE: comparing the regular (clustered / non-clustered) index with the columnstore index)


SET STATISTICS IO ON
GO
SET STATISTICS TIME ON
GO

5. Select the table with the clustered regular index


SELECT ProductKey, SUM(UnitCost) AS 'SumUnitCost',
  AVG(UnitsBalance) AS 'AvgUnitsBalance'
FROM [dbo].[MyFactProductInventory]
GROUP BY ProductKey
ORDER BY ProductKey
GO
/* First testing results: */
-- (606 row(s) affected)
-- Table 'MyFactProductInventory'. Scan count 5, logical reads 602219,
-- physical reads 81, read-ahead reads 379178, lob logical reads 0,
-- lob physical reads 0, lob read-ahead reads 0.
-- SQL Server Execution Times: CPU time = 23665 ms, elapsed time = 18398 ms.
/* Second testing results: */
-- (606 row(s) affected)
-- Table 'MyFactProductInventory'. Scan count 5, logical reads 601939,
-- physical reads 0, read-ahead reads 602119, lob logical reads 0,
-- lob physical reads 0, lob read-ahead reads 0.
-- SQL Server Execution Times: CPU time = 20670 ms, elapsed time = 22654 ms.
/* Third testing results: */
-- (606 row(s) affected)
-- Table 'MyFactProductInventory'. Scan count 5, logical reads 602244,
-- physical reads 0, read-ahead reads 598015, lob logical reads 0,
-- lob physical reads 0, lob read-ahead reads 0.
-- SQL Server Execution Times: CPU time = 21123 ms, elapsed time = 20832 ms.

6.1. Create the non-clustered regular index


CREATE NONCLUSTERED INDEX [IX_MyFactProductInventory_Nonclustered]
ON [MyFactProductInventory](UnitCost, UnitsBalance, ProductKey)
GO

7.1. Select the table with the non-clustered regular index


SELECT ProductKey, SUM(UnitCost) AS 'SumUnitCost',
  AVG(UnitsBalance) AS 'AvgUnitsBalance'
FROM [dbo].[MyFactProductInventory]
GROUP BY ProductKey
ORDER BY ProductKey
GO
/* First testing results: */
-- (606 row(s) affected)
-- Table 'MyFactProductInventory'. Scan count 5, logical reads 290552,
-- physical reads 0, read-ahead reads 32391, lob logical reads 0,
-- lob physical reads 0, lob read-ahead reads 0.
-- SQL Server Execution Times: CPU time = 31323 ms, elapsed time = 8665 ms.
/* Second testing results: */
-- (606 row(s) affected)
-- Table 'MyFactProductInventory'. Scan count 5, logical reads 290652,
-- physical reads 0, read-ahead reads 0, lob logical reads 0,
-- lob physical reads 0, lob read-ahead reads 0.
-- SQL Server Execution Times: CPU time = 32448 ms, elapsed time = 8428 ms.
/* Third testing results: */
-- (606 row(s) affected)
-- Table 'MyFactProductInventory'. Scan count 5, logical reads 290597,
-- physical reads 0, read-ahead reads 0, lob logical reads 0,
-- lob physical reads 0, lob read-ahead reads 0.
-- SQL Server Execution Times: CPU time = 31138 ms, elapsed time = 8351 ms.
ALTER INDEX [IX_MyFactProductInventory_Nonclustered]
ON [MyFactProductInventory] DISABLE
GO

6.2. Create the columnstore index


CREATE NONCLUSTERED COLUMNSTORE INDEX [IX_MyFactProductInventory_ColumnStore]
ON [MyFactProductInventory](UnitCost, UnitsBalance, ProductKey)
GO

7.2 Select the table with the columnstore index


SELECT ProductKey, SUM(UnitCost) AS 'SumUnitCost',
  AVG(UnitsBalance) AS 'AvgUnitsBalance'
FROM [dbo].[MyFactProductInventory]
GROUP BY ProductKey
ORDER BY ProductKey
GO
/* First testing results: */
-- (606 row(s) affected)
-- Table 'MyFactProductInventory'. Scan count 4, logical reads 12497,
-- physical reads 4, read-ahead reads 19973, lob logical reads 0,
-- lob physical reads 0, lob read-ahead reads 0.
-- SQL Server Execution Times: CPU time = 874 ms, elapsed time = 559 ms.
/* Second testing results: */
-- (606 row(s) affected)
-- Table 'MyFactProductInventory'. Scan count 4, logical reads 12320,
-- physical reads 0, read-ahead reads 4853, lob logical reads 0,
-- lob physical reads 0, lob read-ahead reads 0.
-- SQL Server Execution Times: CPU time = 1043 ms, elapsed time = 363 ms.
/* Third testing results: */
-- (606 row(s) affected)
-- Table 'MyFactProductInventory'. Scan count 4, logical reads 12370,
-- physical reads 0, read-ahead reads 9278, lob logical reads 0,
-- lob physical reads 0, lob read-ahead reads 0.
-- SQL Server Execution Times: CPU time = 874 ms, elapsed time = 435 ms.

8. Cleanup


DROP INDEX [IX_MyFactProductInventory_ColumnStore]
ON [dbo].[MyFactProductInventory]
GO
TRUNCATE TABLE [dbo].[MyFactProductInventory]
GO
DROP TABLE [dbo].[MyFactProductInventory]
GO
SET STATISTICS TIME OFF
GO
SET STATISTICS IO OFF
GO

Conclusion: the above results clearly show that the performance of the query is extremely fast after creating the columnstore index
  • for logical reads: approximately 48 times faster than the clustered and 23 times faster than the non-clustered index
  • for the elapsed time: approximately 33 times faster than the clustered and 16 times faster than the non-clustered index


Additional references:
Task: to show how simply to hack a system stored procedure in SQL Server 2005 and higher by means of DAC (Dedicated Administrator Connection) and resource database (the read-only mssqlsystemresource database)




WARNING: by hacking any part of SQL Server, e.g. a system stored procedure, you will lose the warranty from Microsoft. This blog post has the educational purpose only. For testing purposes, you should try it on the development machine only. I do not take any responsibility for any damages caused by this article.

Recommendations:
  • make a backup of the mssqlsystemresource database
  • if necessary, add “Full control” permissions for the user of “NT Service\MSSLQSERVER” (the account of SQL Server (MSSQLSERVER) service) on “C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Binn\mssqlsystemresource.ldf and mssqlsystemresource.mdf”
  • run the cmd.exe as an administrator

The process of altering, i.e. hacking, e.g. of the system stored procedure sys.sp_who2 consists of the following steps:
1. Windows > Run => RUN A NEW CMD WINDOW


net stop mssqlserver
net start mssqlserver
cd C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Binn
sqlcmd -S .\
sys.sp_who2
go
exit
exit @echo CLOSE THE CMD WINDOW

2. Windows > Run => RUN A NEW CMD WINDOW


net stop mssqlserver
cd C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Binn
sqlservr -s mssqlserver -m
@echo DO NOT CLOSE THE CMD WINDOW

3. Windows > Run => RUN A NEW CMD WINDOW


cd C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Binn
sqlcmd -S .\ -A
use mssqlsystemresource
go
alter database mssqlsystemresource set read_write
go
sp_helptext 'sys.sp_who2'
go
alter procedure sys.sp_who2 as select 'hacked procedure'
go
sp_helptext 'sys.sp_who2'
go
alter database mssqlsystemresource set read_only
go
exit
exit @echo CLOSE ALL RUNNING CMD WINDOWS

4. Windows > Run => RUN A NEW CMD WINDOW


net start mssqlserver
cd C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Binn
sqlcmd -S .\
sys.sp_who2
go
exit
exit @echo CLOSE THE CMD WINDOW


Abbreviations:
  • -A > dedicated admin connection
  • -S > server
  • -m > single user admin mode
  • -s > name (alternate registry key name)
  • .\ > localhost and default instance


Figure 1: SQLCMD result – the procedure sys.sp_who2 before hacking

Figure 1: SQLCMD result – the procedure sys.sp_who2 before hacking


Figure 2: SQLCMD result – the procedure sys.sp_who2 after hacking

Figure 2: SQLCMD result – the procedure sys.sp_who2 after hacking


Additional references:
Task: to create a SSIS package to incremental data loading by means of Change Tracking



The process of creating a SSIS solution consists of the following steps:
1. creating of a SSDT (SQL Server Data Tools) SSIS project
2. enabling of the ‘CHANGE TRACKING’ feature on a database and table level
3. creating of the ‘MySettings’ table and inserting a default value into the ‘LastVersion’ column
4. creating of the ‘Version_Get’ stored procedure
5. getting and setting of the ‘@LastVersion’ and ‘@CurrentVersion’ variables
6. creating of the FAKE DWH ‘BusinessEntity_DWH’ table
7. FAKE inserting into the ‘BusinessEntity’ table
8. incremental data loading into the fake DWH ‘BusinessEntity_DWH’ table (the core of ‘CHANGE TRACKING’)
9. saving of the ‘@CurrentVersion’ variable into the ‘LastVersion’ column
10. disabling of the ‘CHANGE TRACKING’ feature on a table and database level; deleting of created objects

Change Tracking – Control Flow

Figure 1: Change Tracking – Control Flow


Change Tracking – Data Flow

Figure 2: Change Tracking – Data Flow


Source code: SSIS Package (SQL Server 2012 – SSDT)
Note: do not forget to customize the ConnectionString in the Project.params setting

Additional references:
Task: to create a SSIS package to delete duplicates by means of Fuzzy logic



The process of creating a SSIS solution consists of the following steps:
1. creating of a SSDT (SQL Server Data Tools) SSIS project
2. defining of connection to data sources
3. defining of connection to saving duplicate records
4. defining of connection to saving error messages
5. defining of connection to a database
6. creating of a table in a target database
7. creating of a Data Flow diagram
8. removing of exact duplicates
9. removing of “similar” duplicates by means of Fuzzy logic
10. saving of retrieved duplicates

Deleting duplicates by means of Fuzzy logic

Figure 1: Deleting duplicates by means of Fuzzy logic


Source code: SSIS Package (SQL Server 2012 – SSDT)
Note: do not forget to customize the Connection Managers in solution setting

Additional references:
Task: to give a deeper look at the work of the query optimizer with the query hint OPTION (OPTIMIZE FOR ()) in SQL Server



Indices represent one of the best performance techniques. The operator seek is being used on large tables (the overhead by using B-Tree on small tables leads to poor performance). The operator scan is being used on small tables. An improper using of indices can have a negative impact on the performance.
The query optimizer usually selects the optimal execution plan with regard to statistical data. For this reason, the using of query hints is recommended only for advanced database developers or DBAs.


/* STEP 1 – BEGIN */
/* PREPARE – WITHOUT INDICES (BEGIN) */
USE [Test]
GO

IF EXISTS (SELECT * FROM sys.objects WHERE object_id =
        OBJECT_ID(N'[dbo].[QueryOptimizer]') AND type in (N'U'))
BEGIN
	DROP TABLE [dbo].[QueryOptimizer]
END
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

SET ANSI_PADDING ON
GO

CREATE TABLE [dbo].[QueryOptimizer](
	[TerritoryID] [int] IDENTITY(1,1) NOT NULL,
	[Name] [varchar](50) NOT NULL,
	[CountryRegionCode] [varchar](3) NOT NULL
) ON [PRIMARY]
GO

SET ANSI_PADDING OFF
GO

SET NOCOUNT ON
GO
/* PREPARE – WITHOUT INDICES (END) */


/* CLEAN (BEGIN) */
TRUNCATE TABLE dbo.QueryOptimizer
GO
/* CLEAN (END) */
/* STEP 1 – END */


/* TEST 1: WITHOUT INDICES – SMALL TABLE: SELECTIVITY (BEGIN) */
/* STEP 2 – BEGIN */
-- Execution Plan: OFF

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Northwest', 'US')
	SET @rows += 1
END
GO

DECLARE @rows INT = 1
WHILE @rows < 2 -- 1 row
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Canada', 'CA')
	SET @rows += 1
END
GO 10 -- 10 * 1 rows
/* STEP 2 – END */

/* STEP 3 – BEGIN */
-- Execution Plan: ON
SET STATISTICS IO ON
GO
SET STATISTICS TIME ON
GO

SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = 'US'
-- US: NONSELECTIVE VALUE, TABLE SCAN, 1000 of 1010 rows
GO
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 4,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/

SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = 'CA'
-- CA: SELECTIVE VALUE, TABLE SCAN, 10 of 1010 rows
GO
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 4,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/

SET STATISTICS IO OFF
GO
SET STATISTICS TIME OFF
GO
/* STEP 3 – END */
/* TEST 1: WITHOUT INDICES – SMALL TABLE: SELECTIVITY (END) */


/* STEP 4 – BEGIN */
/* CLEAN (BEGIN) */
TRUNCATE TABLE dbo.QueryOptimizer
GO
/* CLEAN (END) */
/* STEP 4 – END */


/* TEST 2: WITHOUT INDICES – LARGE TABLE: SELECTIVITY (BEGIN) */
/* STEP 5 – BEGIN */
-- Execution Plan: OFF

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Northwest', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Northeast', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Central', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Southwest', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Southeast', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Southeast', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 2 -- 1 row
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Canada', 'CA')
	SET @rows += 1
END
GO 10 -- 10 * 1 rows
/* STEP 5 – END */

/* STEP 6 – BEGIN */
-- Execution Plan: ON
SET STATISTICS IO ON
GO
SET STATISTICS TIME ON
GO

SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = 'US'
-- US: NONSELECTIVE VALUE, TABLE SCAN, 60000 of 60010 rows
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 231,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/
GO

SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = 'CA'
-- CA: SELECTIVE VALUE, TABLE SCAN, 10 of 60010 rows
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 231,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/
GO

SET STATISTICS IO OFF
GO
SET STATISTICS TIME OFF
GO
/* STEP 6 – END */
/* TEST 2: WITHOUT INDICES – LARGE TABLE: SELECTIVITY (END) */

Source code 1: Tests without using indices


SSMS result – TEST 1: WITHOUT INDICES – SMALL TABLE: SELECTIVITY (US: LOW / CA: HIGH)

Figure 1: SSMS result – TEST 1: WITHOUT INDICES – SMALL TABLE: SELECTIVITY (US: LOW / CA: HIGH)


SSMS result – TEST 2: WITHOUT INDICES – LARGE TABLE: SELECTIVITY (US: LOW / CA: HIGH)

Figure 1: SSMS result – TEST 2: WITHOUT INDICES – LARGE TABLE: SELECTIVITY (US: LOW / CA: HIGH)



/* STEP 1 – BEGIN */
/* PREPARE – WITH INDICES (BEGIN) */
USE [Test]
GO

IF EXISTS (SELECT * FROM sys.objects WHERE object_id =
        OBJECT_ID(N'[dbo].[QueryOptimizer]') AND type in (N'U'))
BEGIN
	DROP TABLE [dbo].[QueryOptimizer]
END
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

SET ANSI_PADDING ON
GO

CREATE TABLE [dbo].[QueryOptimizer](
	[TerritoryID] [int] IDENTITY(1,1) NOT NULL,
	[Name] [varchar](50) NOT NULL,
	[CountryRegionCode] [varchar](3) NOT NULL,
 CONSTRAINT [PK_QueryOptimizer] PRIMARY KEY CLUSTERED 
(
	[TerritoryID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_QueryOptimizer_CountryRegionCode]
        ON [dbo].[QueryOptimizer]
(
	[CountryRegionCode] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

SET ANSI_PADDING OFF
GO

SET NOCOUNT ON
GO
/* PREPARE – WITH INDICES (END) */


/* CLEAN (BEGIN) */
TRUNCATE TABLE dbo.QueryOptimizer
GO
/* CLEAN (END) */
/* STEP 1 – END */


/* TEST 1: WITH INDICES – SMALL TABLE: SELECTIVITY (BEGIN) */
/* STEP 2 – BEGIN */
-- Execution Plan: OFF

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Northwest', 'US')
	SET @rows += 1
END
GO

DECLARE @rows INT = 1
WHILE @rows < 2 -- 1 row
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Canada', 'CA')
	SET @rows += 1
END
GO 10 -- 10 * 1 rows
/* STEP 2 – END */

/* STEP 3 – BEGIN */
-- Execution Plan: ON
SET STATISTICS IO ON
GO
SET STATISTICS TIME ON
GO

SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = 'US'
-- US: NONSELECTIVE VALUE, INDEX SCAN (Clustered), 1000 of 1010 rows
GO
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 6,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/

SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = 'CA'
-- CA: SELECTIVE VALUE, INDEX SCAN (Clustered), 10 of 1010 rows
GO
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 6,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/

SET STATISTICS IO OFF
GO
SET STATISTICS TIME OFF
GO
/* STEP 3 – END */
/* TEST 1: WITH INDICES – SMALL TABLE: SELECTIVITY (END) */


/* STEP 4 – BEGIN */
/* CLEAN (BEGIN) */
TRUNCATE TABLE dbo.QueryOptimizer
GO
/* CLEAN (END) */
/* STEP 4 – END */


/* TEST 2: WITH INDICES – LARGE TABLE: SELECTIVITY (BEGIN) */
/* STEP 5 – BEGIN */
-- Execution Plan: OFF

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Northwest', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Northeast', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Central', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Southwest', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Southeast', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Southeast', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 2 -- 1 row
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Canada', 'CA')
	SET @rows += 1
END
GO 10 -- 10 * 1 rows
/* STEP 5 – END */

/* STEP 6 – BEGIN */
-- Execution Plan: ON
SET STATISTICS IO ON
GO
SET STATISTICS TIME ON
GO

SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = 'US'
-- US: NONSELECTIVE VALUE, INDEX SCAN (Clustered), 60000 of 60010 rows
GO
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 223,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/

SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = 'CA'
-- CA: SELECTIVE VALUE, INDEX SEEK (NonClustered), 10 of 60010 rows
GO
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 22,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/

SET STATISTICS IO OFF
GO
SET STATISTICS TIME OFF
GO
/* STEP 6 – END */
/* TEST 2: WITH INDICES – LARGE TABLE: SELECTIVITY (END) */

Source code 2: Tests with using indices


SSMS result – TEST 1: WITH INDICES – SMALL TABLE: SELECTIVITY (US: LOW / CA: HIGH)

Figure 2: SSMS result – TEST 1: WITH INDICES – SMALL TABLE: SELECTIVITY (US: LOW / CA: HIGH)


SSMS result – TEST 2: WITH INDICES – LARGE TABLE: SELECTIVITY (US: LOW / CA: HIGH)

Figure 2: SSMS result – TEST 2: WITH INDICES – LARGE TABLE: SELECTIVITY (US: LOW / CA: HIGH)



/* STEP 1 – BEGIN */
/* PREPARE – WITH PRIMARY KEY INDEX ONLY (BEGIN) */
USE [Test]
GO

IF EXISTS (SELECT * FROM sys.objects WHERE object_id =
        OBJECT_ID(N'[dbo].[QueryOptimizer]') AND type in (N'U'))
BEGIN
	DROP TABLE [dbo].[QueryOptimizer]
END
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

SET ANSI_PADDING ON
GO

CREATE TABLE [dbo].[QueryOptimizer](
	[TerritoryID] [int] IDENTITY(1,1) NOT NULL,
	[Name] [varchar](50) NOT NULL,
	[CountryRegionCode] [varchar](3) NOT NULL,
 CONSTRAINT [PK_QueryOptimizer] PRIMARY KEY CLUSTERED 
(
	[TerritoryID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

SET ANSI_PADDING OFF
GO

SET NOCOUNT ON
GO
/* PREPARE - WITH PRIMARY KEY INDEX ONLY (END) */


/* CLEAN (BEGIN) */
TRUNCATE TABLE dbo.QueryOptimizer
GO
/* CLEAN (END) */
/* STEP 1 – END */


/* TEST 1: WITH PRIMARY KEY INDEX ONLY – SMALL TABLE: SELECTIVITY (BEGIN) */
/* STEP 2 – BEGIN */
-- Execution Plan: OFF

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Northwest', 'US')
	SET @rows += 1
END
GO

DECLARE @rows INT = 1
WHILE @rows < 2 -- 1 row
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Canada', 'CA')
	SET @rows += 1
END
GO 10 -- 10 * 1 rows
/* STEP 2 – END */

/* STEP 3 – BEGIN */
-- Execution Plan: ON
SET STATISTICS IO ON
GO
SET STATISTICS TIME ON
GO

SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = 'US'
-- US: NONSELECTIVE VALUE, INDEX SCAN (Clustered), 1000 of 1010 rows
GO
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 6,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/

SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = 'CA'
-- CA: SELECTIVE VALUE, INDEX SCAN (Clustered), 10 of 1010 rows
GO
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 6,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/

SET STATISTICS IO OFF
GO
SET STATISTICS TIME OFF
GO
/* STEP 3 – END */
/* TEST 1: WITH PRIMARY KEY INDEX ONLY – SMALL TABLE: SELECTIVITY (END) */


/* STEP 4 – BEGIN */
/* CLEAN (BEGIN) */
TRUNCATE TABLE dbo.QueryOptimizer
GO
/* CLEAN (END) */
/* STEP 4 – END */


/* TEST 2: WITH PRIMARY KEY INDEX ONLY – LARGE TABLE: SELECTIVITY (BEGIN) */
/* STEP 5 – BEGIN */
-- Execution Plan: OFF

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Northwest', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Northeast', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Central', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Southwest', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Southeast', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Southeast', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 2 -- 1 row
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Canada', 'CA')
	SET @rows += 1
END
GO 10 -- 10 * 1 rows
/* STEP 5 – END */

/* STEP 6 – BEGIN */
-- Execution Plan: ON
SET STATISTICS IO ON
GO
SET STATISTICS TIME ON
GO

SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = 'US'
-- US: NONSELECTIVE VALUE, INDEX SCAN (Clustered), 60000 of 60010 rows
GO
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 223,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/

SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = 'CA'
-- CA: SELECTIVE VALUE, INDEX SCAN (Clustered), 10 of 60010 rows
GO
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 223,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/

SET STATISTICS IO OFF
GO
SET STATISTICS TIME OFF
GO
/* STEP 6 – END */
/* TEST 2: WITH PRIMARY KEY INDEX ONLY – LARGE TABLE: SELECTIVITY (END) */

Source code 3: Tests with primary key index only


SSMS result – TEST 1: WITH PRIMARY KEY INDEX ONLY – SMALL TABLE: SELECTIVITY (US: LOW / CA: HIGH)

Figure 3: SSMS result – TEST 1: WITH PRIMARY KEY INDEX ONLY – SMALL TABLE: SELECTIVITY (US: LOW / CA: HIGH)


SSMS result – TEST 2: WITH PRIMARY KEY INDEX ONLY – LARGE TABLE: SELECTIVITY (US: LOW / CA: HIGH)

Figure 3: SSMS result – TEST 2: WITH PRIMARY KEY INDEX ONLY – LARGE TABLE: SELECTIVITY (US: LOW / CA: HIGH)



/* STEP 1 – BEGIN */
/* PREPARE – WITH NONCLUSTERED INDEX ONLY (BEGIN) */
USE [Test]
GO

IF EXISTS (SELECT * FROM sys.objects WHERE object_id =
        OBJECT_ID(N'[dbo].[QueryOptimizer]') AND type in (N'U'))
BEGIN
	DROP TABLE [dbo].[QueryOptimizer]
END
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

SET ANSI_PADDING ON
GO

CREATE TABLE [dbo].[QueryOptimizer](
	[TerritoryID] [int] IDENTITY(1,1) NOT NULL,
	[Name] [varchar](50) NOT NULL,
	[CountryRegionCode] [varchar](3) NOT NULL
) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_QueryOptimizer_CountryRegionCode]
        ON [dbo].[QueryOptimizer]
(
	[CountryRegionCode] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

SET ANSI_PADDING OFF
GO

SET NOCOUNT ON
GO
/* PREPARE – WITH NONCLUSTERED INDEX ONLY (END) */


/* CLEAN (BEGIN) */
TRUNCATE TABLE dbo.QueryOptimizer
GO
/* CLEAN (END) */
/* STEP 1 – END */


/* TEST 1: WITH NONCLUSTERED INDEX ONLY – SMALL TABLE: SELECTIVITY (BEGIN) */
/* STEP 2 – BEGIN */
-- Execution Plan: OFF

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Northwest', 'US')
	SET @rows += 1
END
GO

DECLARE @rows INT = 1
WHILE @rows < 2 -- 1 row
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Canada', 'CA')
	SET @rows += 1
END
GO 10 -- 10 * 1 rows
/* STEP 2 – END */

/* STEP 3 – BEGIN */
-- Execution Plan: ON
SET STATISTICS IO ON
GO
SET STATISTICS TIME ON
GO

SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = 'US'
-- US: NONSELECTIVE VALUE, TABLE SCAN, 1000 of 1010 rows
GO
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 4,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/

SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = 'CA'
-- CA: SELECTIVE VALUE, TABLE SCAN, 10 of 1010 rows
GO
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 4,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/

SET STATISTICS IO OFF
GO
SET STATISTICS TIME OFF
GO
/* STEP 3 – END */
/* TEST 1: WITH NONCLUSTERED INDEX ONLY – SMALL TABLE: SELECTIVITY (END) */


/* STEP 4 – BEGIN */
/* CLEAN (BEGIN) */
TRUNCATE TABLE dbo.QueryOptimizer
GO
/* CLEAN (END) */
/* STEP 4 – END */


/* TEST 2: WITH NONCLUSTERED INDEX ONLY – LARGE TABLE: SELECTIVITY (BEGIN) */
/* STEP 5 – BEGIN */
-- Execution Plan: OFF

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Northwest', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Northeast', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Central', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Southwest', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Southeast', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Southeast', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 2 -- 1 row
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Canada', 'CA')
	SET @rows += 1
END
GO 10 -- 10 * 1 rows
/* STEP 5 – END */

/* STEP 6 – BEGIN */
-- Execution Plan: ON
SET STATISTICS IO ON
GO
SET STATISTICS TIME ON
GO

SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = 'US'
-- US: NONSELECTIVE VALUE, TABLE SCAN, 60000 of 60010 rows
GO
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 231,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/

SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = 'CA'
-- CA: SELECTIVE VALUE, INDEX SEEK (NonClustered), 10 of 60010 rows
GO
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 12,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/

SET STATISTICS IO OFF
GO
SET STATISTICS TIME OFF
GO
/* STEP 6 – END */
/* TEST 2: WITH NONCLUSTERED INDEX ONLY – LARGE TABLE: SELECTIVITY (END) */

Source code 4: Tests with non-clustered key index only


SSMS result – TEST 1: WITH NON-CLUSTERED INDEX ONLY – SMALL TABLE: SELECTIVITY (US: LOW / CA: HIGH)

Figure 4: SSMS result – TEST 1: WITH NON-CLUSTERED INDEX ONLY – SMALL TABLE: SELECTIVITY (US: LOW / CA: HIGH)


SSMS result – TEST 2: WITH NON-CLUSTERED INDEX ONLY – LARGE TABLE: SELECTIVITY (US: LOW / CA: HIGH)

Figure 4: SSMS result – TEST 2: WITH NON-CLUSTERED INDEX ONLY – LARGE TABLE: SELECTIVITY (US: LOW / CA: HIGH)



/* STEP 1 – BEGIN */
/* PREPARE – Query Hint: OPTION (OPTIMIZE FOR()) (BEGIN) */
USE [Test]
GO

IF EXISTS (SELECT * FROM sys.objects WHERE object_id =
        OBJECT_ID(N'[dbo].[QueryOptimizer]') AND type in (N'U'))
BEGIN
	DROP TABLE [dbo].[QueryOptimizer]
END
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

SET ANSI_PADDING ON
GO

CREATE TABLE [dbo].[QueryOptimizer](
	[TerritoryID] [int] IDENTITY(1,1) NOT NULL,
	[Name] [varchar](50) NOT NULL,
	[CountryRegionCode] [varchar](3) NOT NULL,
 CONSTRAINT [PK_QueryOptimizer] PRIMARY KEY CLUSTERED 
(
	[TerritoryID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_QueryOptimizer_CountryRegionCode]
        ON [dbo].[QueryOptimizer]
(
	[CountryRegionCode] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

SET ANSI_PADDING OFF
GO

SET NOCOUNT ON
GO
/* PREPARE – Query Hint: OPTION (OPTIMIZE FOR()) (END) */


/* CLEAN (BEGIN) */
TRUNCATE TABLE dbo.QueryOptimizer
GO
/* CLEAN (END) */
/* STEP 1 – END */


/* TESTS: WITH Query Hint – OPTION (OPTIMIZE FOR())
   - LARGE TABLE: SELECTIVITY (BEGIN) */
/* STEP 2 – BEGIN */
-- Execution Plan: OFF

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Northwest', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Northeast', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Central', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Southwest', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Southeast', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 1001 -- 1000 rows
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Southeast', 'US')
	SET @rows += 1
END
GO 10 -- 10 * 1000 rows

DECLARE @rows INT = 1
WHILE @rows < 2 -- 1 row
BEGIN
	INSERT INTO dbo.QueryOptimizer (Name, CountryRegionCode)
	VALUES('Canada', 'CA')
	SET @rows += 1
END
GO 10 -- 10 * 1 rows
/* STEP 2 – END */

/* STEP 3 – BEGIN */
-- Execution Plan: ON
SET STATISTICS IO ON
GO
SET STATISTICS TIME ON
GO

DECLARE @WhereValue VARCHAR(50)
SET @WhereValue = 'US'
SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = @WhereValue
-- US: NONSELECTIVE VALUE, INDEX SCAN (Clustered), 60000 of 60010 rows
GO
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 223,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/

DECLARE @WhereValue VARCHAR(50)
SET @WhereValue = 'CA'
SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = @WhereValue
-- CA: SELECTIVE VALUE, INDEX SCAN (Clustered), 10 of 60010 rows
GO
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 223,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/

DECLARE @WhereValue VARCHAR(50)
SET @WhereValue = 'US'
SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = @WhereValue
-- US: NONSELECTIVE VALUE, INDEX SCAN (Clustered), 60000 of 60010 rows
OPTION (OPTIMIZE FOR (@WhereValue = 'US'))
GO
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 223,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/

DECLARE @WhereValue VARCHAR(50)
SET @WhereValue = 'CA'
SELECT *
FROM dbo.QueryOptimizer
WHERE [CountryRegionCode] = @WhereValue
-- CA: SELECTIVE VALUE, INDEX SEEK (NonClustered), 10 of 60010 rows
OPTION (OPTIMIZE FOR (@WhereValue = 'CA'))
GO
/*
Table 'QueryOptimizer'. Scan count 1, logical reads 22,
physical reads 0, read-ahead reads 0, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
*/

SET STATISTICS IO OFF
GO
SET STATISTICS TIME OFF
GO
/* STEP 3 – END */
/* TESTS: WITH Query Hint – OPTION (OPTIMIZE FOR())
   – LARGE TABLE: SELECTIVITY (END) */

Source code 5: Tests with using the query hint – OPTION (OPTIMIZE FOR())


SSMS result – TESTS: WITH Query Hint – OPTION (OPTIMIZE FOR()) – LARGE TABLE: SELECTIVITY (US: LOW / CA: HIGH)

Figure 5: SSMS result – TESTS: WITH Query Hint – OPTION (OPTIMIZE FOR()) – LARGE TABLE: SELECTIVITY (US: LOW / CA: HIGH)

The query optimizer does not know what will be the value of the variable @WhereValue at runtime and does not check statistics, before the query run. The query optimizer will always choose the operator scan.
If the value of the variable @WhereValue will most often be set to the ‘CA’, it would be smarter to optimize the query to use the seek operator instead the scan operator.
It is possible to be done by using of the query hint OPTION (OPTIMIZE FOR()).


Additional references:
Task: to give a basic look at the grouping of rows in SQL Server



Whenever any aggregation function is used, the GROUP BY clause is required. SQL Server 2005 brought the GROUP BY operators like CUBE and ROLLUP, respectively. In addition, SQL Server 2008 brought the GROUP BY operator GROUPING SETS. The GROUPING SETS operator aggregates only specified groups instead of full sets of aggregations by means of CUBE or ROLLUP operators. As an alternative solution for grouping rows in terms of GROUPING SETS operator, one of SET operators was used before SQL Server 2008, the UNION concretely.


USE AdventureWorks2012
GO

SELECT  Name,
        CountryRegionCode,
        SalesLastYear
FROM    Sales.SalesTerritory
GO

Source code 1: Basic example skeleton


SSMS result – Basic example skeleton

Figure 1: SSMS result – Basic example skeleton



USE AdventureWorks2012
GO

SET STATISTICS IO ON
GO

SELECT   Name,
         CountryRegionCode,
         SUM(SalesLastYear) AS TotalOfSalesLastYear
FROM     Sales.SalesTerritory
GROUP BY Name,
         CountryRegionCode

UNION ALL

SELECT   Name,
         NULL,
         SUM(SalesLastYear) AS TotalOfSalesLastYear
FROM     Sales.SalesTerritory
GROUP BY Name

UNION ALL

SELECT   NULL,
         CountryRegionCode,
         SUM(SalesLastYear) AS TotalOfSalesLastYear
FROM     Sales.SalesTerritory
GROUP BY CountryRegionCode

UNION ALL

SELECT   NULL,
         NULL,
         SUM(SalesLastYear) AS TotalOfSalesLastYear
FROM     Sales.SalesTerritory
GO

SET STATISTICS IO OFF
GO

Source code 2: Grouping by means of UNION operator before the GROUPING SETS operator in SQL Server 2008


SSMS result – Grouping by means of UNION operator before the GROUPING SETS operator in SQL Server 2008

Figure 2: SSMS result – Grouping by means of UNION operator before the GROUPING SETS operator in SQL Server 2008



USE AdventureWorks2012
GO

SET STATISTICS IO ON
GO

SELECT   Name,
         CountryRegionCode,
         SUM(SalesLastYear) AS TotalOfSalesLastYear
FROM     Sales.SalesTerritory
GROUP BY
         GROUPING SETS
         (
             (Name, CountryRegionCode),
             (Name),
             (CountryRegionCode),
             ()
         )
ORDER BY GROUPING(Name),
         GROUPING(CountryRegionCode)
GO

SET STATISTICS IO OFF
GO

Source code 3: Grouping by means of GROUPING SETS operator in SQL Server 2008


SSMS result – Grouping by means of GROUPING SETS operator in SQL Server 2008

Figure 3: SSMS result – Grouping by means of GROUPING SETS operator in SQL Server 2008



USE AdventureWorks2012
GO

SET STATISTICS IO ON
GO

SELECT   Name,
         CountryRegionCode,
         SUM(SalesLastYear) AS TotalOfSalesLastYear
FROM     Sales.SalesTerritory
GROUP BY
         CUBE (Name, CountryRegionCode)
ORDER BY GROUPING(Name),
         GROUPING(CountryRegionCode)
GO

SET STATISTICS IO OFF
GO

Source code 4: Grouping by means of CUBE operator in SQL Server 2005


SSMS result – Grouping by means of CUBE operator in SQL Server 2005

Figure 4: SSMS result – Grouping by means of CUBE operator in SQL Server 2005



The result of IO resources consumption using the UNION ALL operator
by returning 27 rows:
Table 'SalesTerritory'. Scan count 4, logical reads 8,
                        physical reads 0, read-ahead reads 0,
                        lob logical reads 0, lob physical reads 0,
                        lob read-ahead reads 0.

The result of IO resources consumption using the GROUPING SETS operator
by returning 27 rows:
Table 'SalesTerritory'. Scan count 2, logical reads 44,
                        physical reads 0, read-ahead reads 0,
                        lob logical reads 0, lob physical reads 0,
                        lob read-ahead reads 0.

The result of IO resources consumption using the CUBE operator
by returning 27 rows:
Table 'SalesTerritory'. Scan count 2, logical reads 44,
                        physical reads 0, read-ahead reads 0,
                        lob logical reads 0, lob physical reads 0,
                        lob read-ahead reads 0.

Source code 5: Results comparison of IO resources consumption across particular SQL Server operators


Additional references: