Tabular Model Partitioning to help Developers

Developing Tabular models in Visual Studio can be super painful as the interface gets very “sluggish” as tables get larger. In some cases customers have told me they wait 5-10 minutes just to, say, change the name of a column or add a calculation.

Another annoying thing about tabular models is that a partitioned fact table can only be processed on one thread PER TABLE. If we start processing and profile the queries sent to SQL Server we will only see one of them, rather than one per partition or one per core as you get with MOLAP. There is pretty much no way around this ;-(

One solution is to develop against a smaller volume of data and then increase the volume once iterative development has finished.

My favourite approach so far is to use a stored procedure for each partition instead of inline SQL, and then use the TABLESAMPLE clause to reduce the volume. We can then have a setting which controls the percentage of facts we want in the model: just change the setting and reprocess to increase the volume. Here is a simple example based on the Adventure Works Reseller Sales table.

ALTER PROCEDURE usp_FactResellerSalesPartition
    @MinDateKey int = 762
    ,@MaxDateKey int = 792
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @sql nvarchar(max);
    DECLARE @PercentSample int = 1;   -- set this dynamically from, say, a settings table

    SET @sql = 'SELECT * FROM [dbo].[FactResellerSales] ';
    -- Only sample when we want less than the full data set; REPEATABLE keeps
    -- the sample stable across processing runs
    IF @PercentSample < 100
        SET @sql = @sql + 'TABLESAMPLE (' + CONVERT(varchar(3), @PercentSample) + ' PERCENT) REPEATABLE (100)';
    SET @sql = @sql + ' WHERE OrderDateKey BETWEEN @MinDateKey AND @MaxDateKey';

    EXEC sp_executesql @sql, N'@MinDateKey int, @MaxDateKey int', @MinDateKey, @MaxDateKey;
END
GO
EXEC usp_FactResellerSalesPartition
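
To make the percentage a genuine setting rather than a hard-coded value, the procedure can read it from a small settings table. A minimal sketch, assuming a hypothetical dbo.ModelSettings table (the name and columns are mine, not part of Adventure Works):

-- Hypothetical settings table holding the sample percentage
CREATE TABLE dbo.ModelSettings (
    SettingName  varchar(100) NOT NULL PRIMARY KEY,
    SettingValue varchar(100) NOT NULL
);
INSERT INTO dbo.ModelSettings (SettingName, SettingValue)
VALUES ('FactSamplePercent', '1');

-- Inside the procedure, replace the hard-coded declaration with a lookup:
DECLARE @PercentSample int =
    (SELECT CONVERT(int, SettingValue)
     FROM dbo.ModelSettings
     WHERE SettingName = 'FactSamplePercent');

Flip FactSamplePercent to 100 and reprocess, and the model loads the full data set without touching any partition definitions.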

Within Visual Studio we can then just call the stored procedure within the Partition Manager dialog box.
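
Each partition’s query simply becomes an EXEC with that partition’s date-key range. The boundaries below are illustrative monthly ranges, not the real Adventure Works partition scheme:

-- One EXEC per partition in the Partition Manager, replacing the inline SELECTs
EXEC usp_FactResellerSalesPartition @MinDateKey = 732, @MaxDateKey = 761
EXEC usp_FactResellerSalesPartition @MinDateKey = 762, @MaxDateKey = 792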

An added benefit of this approach is that if we need to add a column to the fact table we only need to update the single stored procedure rather than every hard-coded T-SQL statement!

Note that we “could” also use a TVF, and we actually do this in a lot of customer systems, but for tabular models I choose a stored procedure because the TABLESAMPLE clause is not supported within a TVF.
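
For comparison, the inline TVF we would otherwise use might look like the sketch below (the function name is illustrative). It handles the date-range filter happily, but adding a TABLESAMPLE clause inside it is a syntax error, which is what pushes us to the stored procedure here:

-- Inline TVF alternative: fine for the date-range filter,
-- but TABLESAMPLE cannot be used inside its definition
CREATE FUNCTION dbo.tvf_FactResellerSalesPartition
    (@MinDateKey int, @MaxDateKey int)
RETURNS TABLE
AS
RETURN
    SELECT *
    FROM [dbo].[FactResellerSales]
    WHERE OrderDateKey BETWEEN @MinDateKey AND @MaxDateKey;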

As a final note, the TABLESAMPLE clause requires a physical table rather than a view. Using physical tables is a best practice for most models anyway, but if you are using a view you’ll need to expand the view out into the stored procedure, or use the ETL to create a physical table.
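
If the fact currently comes from a view, a minimal sketch of that ETL step is below; the view and target table names are assumptions for illustration:

-- Materialise the view into a physical table during ETL
-- so TABLESAMPLE has a real table to work against
IF OBJECT_ID('dbo.FactResellerSales', 'U') IS NOT NULL
    DROP TABLE dbo.FactResellerSales;

SELECT *
INTO dbo.FactResellerSales
FROM dbo.vFactResellerSales;   -- hypothetical source view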

Using this approach I got my processing time in Visual Studio down from 30 minutes to 90 seconds. Much nicer, and iterative development was a dream.

Anyone like this approach or have other suggestions? Leave a comment.
