We all know that Analysis Services uses the windows file system cache when querying, but how does it use the file system write cache when processing dimensions and measure groups?
I couldn’t find much information in the will, so opened up the wonderful process monitor tool from sysinternals to check it out.
As background you’ll need to get some appreciation for the windows API calls “CreateFile” and WriteFile and how they facilitate file caching which is documented here
I looked at three scenarios:
1. ProcessData on a MeasureGroup
2. ProcessFull on a dimension (specifically key stores and attribute stores)
3. Meta data files. As you may know SSAS uses a side by side processing strategy and then updates meta data XML files when the processing is complete so that the new version can now be utilised.
1. Measure Group ProcessData
The screenshot from process monitor shows that SSAS is calling the “WriteFile” API and is not specifying any flags to cause write through. We can also see the “system” process is kicking in shortly after to actually start writing the data to disk from the windows cache.
Interestingly enough, while the IO size is 64k into the windows file system cache we can also see that the lazy writer in the system cache optimises the IO by actually writing to disk in much larger sizes. Potentially up to 2MB in size! (see below)
2. Dimension Processing
For dimension processing, we see that the CreateFile API call does not use any “pass through” flag to write through to the disk, so the windows file system cache is fully utilised (screenshot below) in exactly the same way as measure group processing
3. Meta Data Files
when processing meta data files we can see that Analysis Services forces a flush to the disk subsystem by calling the “FlushBufferFile” API call to force the write to the disk. Assumabley this is to help ensure that server failure does not result in a the switch to the new version of an object not happening. Note that this flushing does not seem to happen when processing measure groups.
So How useful is this information ? Well it does indicate two things:
1. Having a larger block size for processing may be beneficial as although writes are in 64k for measure groups the lazy writer may aggregate this up to 2MB in size. Will try to lab test this …
2. The actual writing to the disk continues in the background by windows cache Lazy Writer for some time after SSAS reports that processing has “completed”. In my simple test I could see the lazy writer still writing to the disk some 4-5 seconds after processing was completed. This begs the question of what happens if the server or disk subsystem fails before the system lazy writer has completed the writes?