SQL Server - ColumnStore Index - TechDB

Latest

All about Database Programming, Performance Tuning and Best Practices.

BANNER 728X90

Monday, 31 December 2018

SQL Server - ColumnStore Index



Introduction

Microsoft introduced a new type of nonclustered index called xVelocity columnstore... index in the SQL Server database engine. It is based on VertiPaq in-memory data compression technology. The xVelocity memory optimized columnstore indexes are one of the most significant performance and scalability enhancements in SQL Server 2012. The xVelocity columnstore indexes speed up the processing time for common data warehousing queries -- particularly in situations where very large quantities of data have to be aggregated and accessed quickly. According to Microsoft, under certain conditions, the xVelocity columnstore index provides queries with speed improvements of 4X, 10X or even 100X.
SQL Server columnstore index technology is especially appropriate for typical data warehousing data sets. columnstore indexes can transform the data warehousing experience for users by enabling faster performance for common data warehousing queries such as filtering, aggregating, grouping, and star-join queries.

Before designing, implementing or managing a columnstore index, it is beneficial to understand how these indexes work and how the data is stored in a columnstore index.

Unlike traditional tables (heaps) and indexes (B-trees), where data is stored and grouped in a row-based fashion, xVelocity columnstore indexes are nonclustered indexes that group and store data column-wise instead of row-wise and then join the columns to complete the index. The columnstore index stores each column in a separate set of disk pages, rather than storing multiple rows per page, which is the traditional storage format.

For example, consider a table containing Customer data, as illustrated below:


In traditional indexes (B-trees), the data of a customer table is organized in one or more data pages, as illustrated in below:


In a columnstore index, customer table data is stored column-wise in a separate set of disk pages, as illustrated in below image:



Lets check how columnstore index improve query performance with an example in few steps.



1- STEP 1 - First check the table and records in the table.



STEP 2 - Create a traditional index on FirstName, LastName and BirthDate.
-- Create a Non clustered index

CREATE NONCLUSTERED INDEX IDX_NonClusteredIndex ON [dbo].[CustCopy] (FirstName,LastName,BirthDate);

GO


STEP 3 - Check the pages distribution and paste the image below.


STEP 4 - RUN and query Select FirstName, LastName, BirthDate from CustCopy ..and check the IO and execution plan an paste the image below..



STEP 5 - Instead of droping and recreating columnstore index on same table, I make a copy of the same table with different name called “Customer”. And this time use ColumnStore index instead of traditional index in this table “customer”.

-- CREATE Non clustered columnstore index

CREATE NONCLUSTERED COLUMNSTORE INDEX [IDX_NonClusteredIndexColumnStore] ON [dbo].[Customer] (FirstName,LastName,BirthDate);


STEP 6 - Check the pages distribution and paste the image.


STEP 7 - RUN and query Select FirstName,LastName,BirthDate from Customer.


STEP 8 - Now execute both and check performance.




Thanks for reading.

No comments:

Post a Comment