If you are looking for an expert level partner to help architect and build you Synapse warehouse, then Prodata offer crucial services to massively cut your implementation time, time to market, quality of solution and operational readiness.
- Modern Architecture using Azure, data lake, synapse (aka sqldw).
- DW automation accelerators for Ingress and staging using IP developed and used across numerous enterprise solutions.
- Templates and guidance on data virtualization, transformation layers and implementation.
- DW automation for Kimball Star schema loading
- Out of the box frameworks for logging, lineage, auditing, performance monitoring and error tracking.
- Operational dashboards tracking solution and business process health
When you engage with Prodata, we bring this IP and experience so that you don’t have to build it from scratch.
Data warehouse architecture in Ireland has finally embraced the use of cloud and massively scale out platforms such as Synapse. This has been driven over the last few years by factors such as:
- The movement to Azure based solutions. Pretty much all our new solutions from 2020 onwards have been in Azure in some shape or form, from IaaS to serverless PaaS.
- The emergence of the data lake as a means of ingesting, storing and archiving source data, while providing big data analytics in tools like Databricks direct on top of the data lake.
- The emergence of Synapse data pools (SQLDW) from its early 100TB+ specialist and expensive days into a mainstream data warehouse platform suitable for the more mainstream 0.5-5TB sized solutions common in medium and large Irish companies. With synapse dedicated pools down to about 20 euro per TB storage and often under 750 euro a month compute cost.
- Increased meta-data automation tools and processes, moving development for large teams of human labour creating 1,000s of ETL packages which instantly become legacy status, into a flexible and extensible meta data drive architecture than embraces change
EDW Modelling and Automation
For both data ingress, egress and transformation into star schema we utilise Prodata proven templates and meta data frameworks. These allow you to focus more on design and data modelling and less on ETL, platform and tooling.
Business modelling techniques like the Kimball BUS Matrix are used to model business processes, server as meta data and ultimately automate creation of a star schema using a Prodata framework optimised for the Synapse Platform.
Devops and Release Automation
Using Azure Pipelines and PowerShell automation we bring the latest in release pipelines to your business intelligence project. No longer are releases a dreaded affair requiring much manual labour and human processes. Now, we strive towards the “ten releases a day” mantra with automated build, release and regression test, bug tracking and release management dashboards.
Operational and Business Monitoring
We have templates to help build a custom, operational dashboard for your enterprise data warehouse and business intelligence platform.
Events monitored include:
- SLA Metrics
- Data Availability and Accuracy
- Pipeline failures
- Load failures
- Model refresh failures
- Business data quality
Managed Self Service
Managed self-service involves building a semantic model for your business (ERP or other processes) with data correctly categorised and secured int data models. This is a more enterprise and pragmatic approach to just letting the majority of users loose on raw unsecured and unprocessed data.
Users can then connect to these models to create their own self-service analytics and reporting in power BI. This empowers finance, business analysts and any user who can become familiar in Power BI.
Prodata can help you get the correct level of governance with templates and accelerators for:
- Creating data dictionaries for users.
- Help libraries and guides.
- End user Training.
- Logging and Monitoring analytics.
- Guidance on using Power BI visualizations on your data model.
Synapse and Data Warehousing Blogs
Data Lake Storage to Synapse Analytics using Managed Service Identity (MSI) – COPY INTO and PolyBase
Why to use MSI? Easy to authenticate any Azure Active Directory supported service Provides limited visibility of the credentials More secured than SQL authentication and less susceptible to hacking Configuration steps Create a storage account and enable Hierarchical namespace for … Continue reading Data Lake Storage to Synapse Analytics using Managed Service Identity (MSI) – COPY INTO and PolyBase
Configuration Steps Create a Azure Data Lake Storage Gen 2 storage account, make sure to enable Hierarchical namespace. Read Microsoft KB here. Create a Synpase workspace and select newly created data lake storage as Account name and blob container as … Continue reading Synapse workspace: Load data using Polybase and Managed Identity
In data engineering a common challenge is to securely establish communication between different services. By providing an Azure resource identity in Azure AD and using it to obtain Azure Active Directory (Azure AD) tokens, Managed identities (formerly known as Managed … Continue reading How to configure Managed Identity in Synapse Analytics Workspace?
When working with synapse as a TSQL developer, it does feel like you’ve gone in a time warp back to SQL 7.0 in terms of whats supported and whats not. Luckily most of the TSQL patterns not supported have some … Continue reading Synapse TSQL: variable assignment in SELECT statement cannot be included
Inspired by the DBA tools (https://dbatools.io/) library for automating all things SQL Server DBEngine I created some Powershell Library functions to automate SSIS Build and Deployment. We use this on projects to move SSIS closer to the world of DevOps … Continue reading Automating SSIS Build and Deployment with Powershell
Thanks to everyone who attended my session. Here are some links that might be useful for further reading: – Presentation Deck (attached to this blog) – Microsoft Assessment and Planning Application Toolkit http://www.microsoft.com/downloads/en/details.aspx?FamilyID=67240b76-3148-4e49-943d-4d9ea7f77730&displaylang=en – Sql Consolidation Planning Add-In for Excel … Continue reading SQL Consolidation Planning Session at SqlBits 8
The Microsoft Assessment and Planning Toolkit is a fantastic tool for helping gather resource usage for a SQL consolidation / virtualisation project. http://technet.microsoft.com/en-us/library/bb977556.aspx One thing that is quite basic though is its presentation of disk IOPS (see below). This article … Continue reading Using MAP Tool and Excel to Analyse IO for SQL Consolidation. Part I – basic distribution
Storage area networks are getting larger and more sophisticated. One of the problems I face as SQL consultants is running sqlio on them. One common stumbling block is the size of the cache on some storage area networks can be … Continue reading SQLIO tip #1 Dealing with large SAN cache