Do we need to maintain Columnstores in SQL Pools ?

TLDR

The short answer is YES – if you have >= 60 million rows, then a poorly maintained column store was 100% slower due to trim fragmentation and 64% slower due to the delta row store with some updates in our tests.

Obviously this is just a small test on only 60 million rows and a DW100C. YMMV on larger tables and larger SKUs. Let me know what impact youve seen by applying maintenance!

The golden lining is that the tuple mover will help somewhat if you get a lot of churn by auto merging some small rowgroups or row groups with > 10% deletes.

In later posts I’ll demonstrate exaclty how to do this maintenance our Synapse Maintenance Solution below
https://github.com/ProdataSQL/SynapseTools

The Longer Version

For best query and storage performance a row group within a ColumnStore will always aim to store 1,048,576 rows (1024*1024).  However, if this number if lower, then the row group is sub-optimal and query performance and storage space will be affected. Don’t forget that we have 60 distributions in a SqlPool, so an insert must ideally be over 60 million for each row group to reach optimal size.

This is covered in the MS docs here

https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-memory-optimizations-for-columnstore-compression

The question here is is the juice worth the squeeze ? on maintenance
eg if we go to the bother of implementing daily/wekly column store maintenance what might the benefits be.

To demonsrate this I created an identical table on a DW100x with a perfect columnstore and a sub optimal column store and tested the performance a simple select statement. Both tables have about 60 million rows with one having omly 60 row groups and one having almost 600 row groups for the exact same data content.

The table with the larger number of column stores took 59 seconds as opposed to 31 seconds. Almost 100% slower.

As a caveat – If your data volume is really small, the buffer pool, storage caching and result set caching will help hide this difference, especially if the column stores fit into primary ram or even better L2 cache.

How to create a fragmented columnstore

If a we create a table with about 60 million rows but bulk load it in 10 chunks, then we will have 10x more row groups than we really need (In theory). Code below

CREATE TABLE [dbo].[FactFinance100m] WITH (
	DISTRIBUTION = ROUND_ROBIN, CLUSTERED COLUMNSTORE INDEX 
) AS
SELECT TOP 6291456 * FROM [dbo].[FactFinance1b]
INSERT INTO FactFinance100m ([AccountKey], [ScenarioKey], [DepartmentGroupKey], [DateKey], [OrganizationKey], [Amount], [Date], [LineageKey])

SELECT TOP 6291456  [AccountKey], [ScenarioKey], [DepartmentGroupKey], [DateKey], [OrganizationKey], [Amount], [Date], [LineageKey] FROM [dbo].[FactFinance1b]
GO 9

We can see that instead of the ideal 60 row groups (1 per distribution) we actually have 582. Note: not 600 as expected due to some mysterious AUTO_MERGE of the row stores by the tuple mover.

select * From [dbo].[vCS_rg_physical_stats] WHERE logical_table_name=’FactFinance100m’

(582 rows affected)

We can see that the trim reason is mostly BULKLOAD. A common reason for column stores to be closed early as the load had not enough rows. The BULKLOAD will create a row group as small as 100k in some cases.

How to create a perfect Columstore

If we create the entire columnstore at once (and no trim issues), then we will get a near perfect columnstore table like below

CREATE TABLE [dbo].[FactFinance100m2] WITH (
	DISTRIBUTION = ROUND_ROBIN, CLUSTERED COLUMNSTORE INDEX 
) AS
SELECT TOP 62914560 * FROM [dbo].[FactFinance1b]

With this, we would hope for excatly 60 rowgroups with the same number of rows each. However its not quite that perfect as the round robin distribution doesnt quite evenly distribute so we have 64 row groups. Close to the perfect 60

Will small row groups merge themselves ?

In some cases a SqlPool used the tuple mover to AUTO_MERGE two smaller row groups and mark the old row group as TOMBSTONE to be deleted by a ghost cleaup processs. We can see below that rather than create 600 row groups we actually had 582.

I dont have a lot of details as to how the AUTO_MERGE works in Syanpse SQL Pools, but it certainly will not do 100% of the required maintenance to keep optimal. We can see the few merged row groups below though.

SELECT * From [dbo].[vCS_rg_physical_stats] 
 WHERE logical_table_name='FactFinance100m'and (trim_reason_desc <>'BULKLOAD' or trim_reason_desc is null)

Test2 – Impact of inserted/deleted rows in the delta store

For this test, I took the “perfect” columnstore with the smaller number of row groups and issues an update to create 6 million rows out of the 60 million as inserts/deletes within a row/delta store as opposed to the column store and re-ran the same query for comparison.

With just under 10% of the rows updated we had query time from 31 to 51 seconds. So 64% slower.

The query used is below to create some churn in terms of deleted and inserted rows.

UPDATE FactFinance100mv2
SET Amount=Amount+1 WHERE 
DateKey > 20130000 and DateKey < 20140000

We can see the count of deleted / inserted rows using the sample View from Prodata Synapse Tools. https://github.com/ProdataSQL/SynapseTools/blob/main/SqlPools/Maintenance/vColumnStoreStats.sql

select * from vColumnStoreStats 
WHERE table_name='FactFinance100mv2'

Will Deletes fix themselves ?

If > 10% of a row group is deleted, then the tuple move may do an AUTO_MERGE to create a new row group and then remove the need to store the deletes. This is assuming the Synapse SqlPool behaviour is the same as SQL2016+ tuple mover as below. https://techcommunity.microsoft.com/t5/datacat/sql-2016-columnstore-row-group-merge-policy-and-index/ba-p/305255

If AUTO_MERGE does not remove the deletes, then the only option in an index rebuild, which can be quite a large operation.

Leave a Reply