The need for DW Automation (1/3)

Many folks build star schema data warehouses and the supporting ecosystem of semantic models, business intelligence and/or ML analytics. This is a quick note on how to assess where you are with automation, regardless of which tools you are using: SSIS, ADF, SQL, Databricks, Synapse, or non-Microsoft tools.

I recently ran a Twitter thread and the #1 feature for a DW was agility, e.g. being able to quickly get data from source into the star schema, or at least into a curated format in data lake speak. I would add extensibility to that: there is no point in building something quickly if you can't change it quickly!

How do smaller teams achieve agility when larger new teams don't?

How do consultancy/partner houses who develop DWs for a living achieve this? Well, there are two major differences between an experienced team that produces quick, successful projects and larger companies that hire lots of people but flounder at building things quickly and efficiently without them becoming instant “legacy status”.

  1. They have a good, proven architecture and methodology. Discussion of tools and platforms is a 5 minute conversation, as the team has done this 10-100 times before and has T-shirt sizes. All team members know the naming conventions, the tools and how to use them, how to code, how requirements are documented, source control, DevOps, and what to do without much discussion. Some team members may come and go, but a senior architect and core members will persist and onboard new members.
  2. They re-use IP between projects to accelerate the process. Let's call this DW Automation tools, although in practice a lot of this could be cut and paste from one customer to another at early maturity. Even that first step gives them a massive boost over the greenfield team.

What scope does DW Automation apply to?

One thing to clarify is the scope of DW automation tools. For someone just starting out, the focus is always ingest or extract. Examples are:

However, while this is very much the best place to start, it is just the “easy pickings” and the start of the journey. The more complex parts are:

Eliminating piping and fat?

Anyone who hears me talk about DW architecture will hear me talk a bit about three components of data architecture: fat, piping and business logic.

Clearly we want to use a framework which has a standard fat, automates piping and allows the data engineers to plug in business logic (without writing any fat).
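As a rough illustration of that split, here is a minimal Python sketch. It assumes that "fat" means the repeated boilerplate around a load (logging, timing, error handling) and that "piping" means moving rows from a source to a target; both readings are assumptions made for the example. The names `run_load` and `sales_logic`, the column names, and the in-memory lists standing in for real sources and targets are all hypothetical and not taken from any particular tool.

```python
import logging
import time
from typing import Any, Callable, Dict, Iterable, List

Row = Dict[str, Any]
Transform = Callable[[Row], Row]

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("dwa_sketch")


def run_load(name: str, source: Iterable[Row], transform: Transform) -> List[Row]:
    """Framework-owned code (hypothetical).

    'Fat' (assumed meaning): logging, timing and error handling.
    'Piping' (assumed meaning): moving each row from source to target.
    """
    started = time.perf_counter()
    log.info("load %s started", name)
    target: List[Row] = []
    try:
        for row in source:
            # The only call into engineer-written code.
            target.append(transform(row))
    except Exception:
        log.exception("load %s failed", name)
        raise
    elapsed = time.perf_counter() - started
    log.info("load %s finished: %d rows in %.3fs", name, len(target), elapsed)
    return target


def sales_logic(row: Row) -> Row:
    """Business logic only: the part the data engineer plugs in."""
    return {**row, "net_amount": row["amount"] - row["discount"]}


# In-memory stand-in for a real source table.
source_rows = [
    {"order_id": 1, "amount": 100.0, "discount": 10.0},
    {"order_id": 2, "amount": 50.0, "discount": 0.0},
]
loaded = run_load("fact_sales", source_rows, sales_logic)
```

The point of the design is that `sales_logic` contains no logging, no error handling and no knowledge of where the rows come from or go to; all of that lives once in the framework function and is reused by every load.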

Some commercial tools attempt to use low code or workbenches to fully automate business logic. I’m not so sure about that myself, as they end up becoming a development environment rather than enhancing one, and can fall behind the traditional development environment, be that ADF or, say, Databricks.

What do these DWA Frameworks look like?

A DWA Framework can take many forms

As an example

What's Next

In the next blog I will look at a maturity model for measuring where you are with DW Automation, versus where you could be.

It's not necessarily a good move to go up the maturity level. E.g. moving from loose IP to a full commercial product could be the kiss of death for a consultancy practice, IMO. Services and products are difficult to mix (or they have been for me!).
