Business context
An enterprise organisation with a large on-premises footprint (high-performance compute clusters, ad-hoc compute systems, centralised storage) and several independent teams.
Setting up a scalable Azure Data Platform with the following goals:
centralised security policies
project governance
data discoverability
Scope & objectives
Centralised platform governance
Implementing Data Mesh principles
Scheduling / Orchestration
Everything as code (infrastructure, configuration, access policies)
Templates/Blueprints for building data products
APIs serving data
Internal reporting (dashboards)
Key results
First steps with a data catalog (Azure Purview)
Automating infrastructure provisioning (Azure DevOps + Terraform)
Automating project deployments (Azure DevOps)
Data Product Orchestration and Governance (Datafy)
Centralised API management (Azure API Management)
Impact
With the Data Catalog, we laid the foundations for a self-service Data Platform
The everything-as-code approach increased the reliability of the Data Platform
Sped up data product development and reduced maintenance costs