Building for Scale with Terraform and CI/CD
A Solution Architect’s Notebook
This series follows my journey in designing and building a complete, real-time data platform on Databricks. It begins as a personal project and evolves into a professional study in architecture, automation, and governance.
Each article focuses on one stage of the build, from setting up infrastructure with Terraform and GitHub Actions to developing streaming pipelines, enforcing quality, and applying Unity Catalog for data governance.
The intent is not to chase speed but to practise discipline, clarity, and design thinking: the habits that define a Solution Architect.
In the first article, “Building a Real Time Flood Warning Pipeline on Databricks, an introduction”, I explained why I wanted to build something real: a complete data platform that handles live flood warnings from the UK Government. This is the next stage, where ideas turn into infrastructure.
This part of the journey is about creating the foundation that everything else will rest on. I wanted to stand up a Databricks workspace on Azure, automate it properly, and make sure the entire setup could be destroyed and rebuilt with a single command. In other words, start behaving like a solution architect would in a real project.
Starting from a clean slate
I began with an empty Azure subscription and a blank GitHub repository. There was something satisfying about that. It meant that every line of code I wrote would define something that didn’t exist before.
Terraform became my starting point. The goal was to codify everything: the resource group, the Databricks workspace, the tags, the structure. No clicking around in the portal, no hidden configuration, just declarative code.
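As a rough illustration of what "codify everything" can look like, here is a minimal sketch of a resource group and a Databricks workspace sharing one set of tags. Every name, the region, and the tag keys are hypothetical placeholders rather than the values used in this project, and the provider configuration is assumed to live in its own file.

```hcl
# Hypothetical names and values throughout; adjust to your own conventions.
locals {
  # One tag map reused on every resource, so ownership is declared once.
  tags = {
    project     = "flood-warning-pipeline"
    environment = "dev"
    owner       = "platform"
    managed_by  = "terraform"
  }
}

resource "azurerm_resource_group" "this" {
  name     = "rg-flood-dev"
  location = "uksouth"
  tags     = local.tags
}

resource "azurerm_databricks_workspace" "this" {
  name                = "dbw-flood-dev"
  resource_group_name = azurerm_resource_group.this.name
  location            = azurerm_resource_group.this.location
  sku                 = "premium" # assumed here because the series later relies on Unity Catalog
  tags                = local.tags
}
```

Keeping the tags in a single local means a new tag is added in one place and picked up by every resource that references it.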
I like the feeling of predictability that comes with Terraform. Run plan and you see exactly what will change; run apply and that is what happens. That predictability is what turns good engineering into architecture.
Defining the environment
The configuration started small and grew carefully. I had a main file for resources, a provider file for configuration, and separate variable and output files. It looked simple, but it had to be exact.
The first challenge came when I needed both the Azure provider to create the workspace and the Databricks provider to manage what happens inside it. It took a few attempts to get those working together without stepping on each other’s toes.
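One common way to wire the two providers together is to let the Databricks provider read its connection details from the workspace that the Azure provider creates. The sketch below assumes a workspace resource named azurerm_databricks_workspace.this is declared in the main file; that name is a placeholder, and the arrangement that finally worked in this project may have differed.

```hcl
terraform {
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
    }
    databricks = {
      source = "databricks/databricks"
    }
  }
}

# Subscription and credentials are assumed to come from ARM_* environment
# variables or an existing `az login` session.
provider "azurerm" {
  features {}
}

# The Databricks provider points at the workspace created by azurerm, so
# Terraform can only configure it once that workspace's attributes are known.
provider "databricks" {
  host                        = azurerm_databricks_workspace.this.workspace_url
  azure_workspace_resource_id = azurerm_databricks_workspace.this.id
}
```

Because the second provider depends on an output of the first, problems tend to surface on the very first apply, when the workspace does not yet exist.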
Once it finally applied cleanly, the feeling was immediate relief and quiet satisfaction. Terraform had built the first version of the environment exactly as described.
Automating everything
The next step was automation. I didn’t want to rely on stored client secrets or manual logins. I wanted the workflow itself to authenticate securely every time it ran.
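That’s where OpenID Connect came in. GitHub issues each workflow run a short-lived OIDC token, which Azure exchanges for an access token, so no client secret is stored anywhere. It took a bit of reading and a few authentication errors before it clicked. The key was making sure the subject on the federated credential in Azure matched what GitHub sends exactly. For a job tied to an environment, that subject takes the form repo:<org>/<repo>:environment:<name>, so the environment name has to match character for character.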
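For illustration, the federated credential can itself be declared in Terraform with the azuread provider. This is a sketch under assumptions, not a record of how the credential was created here: the variable names are hypothetical, and the application is assumed to exist already.

```hcl
# Hypothetical inputs: your own organisation, repository, environment and app registration.
variable "github_org" {
  type = string
}

variable "github_repo" {
  type = string
}

variable "github_environment" {
  type = string
}

variable "application_id" {
  type        = string
  description = "Resource ID of the app registration, in the form /applications/<object-id>"
}

locals {
  # For jobs that declare an environment, GitHub sets the token subject to this format.
  # Azure compares it as an exact string, so case and spelling both matter.
  github_subject = "repo:${var.github_org}/${var.github_repo}:environment:${var.github_environment}"
}

resource "azuread_application_federated_identity_credential" "github" {
  application_id = var.application_id # older azuread releases use application_object_id instead
  display_name   = "github-actions-${var.github_environment}"
  issuer         = "https://token.actions.githubusercontent.com"
  audiences      = ["api://AzureADTokenExchange"]
  subject        = local.github_subject
}
```

Building the subject from variables keeps the environment name in one place, which makes a silent mismatch between GitHub and Azure less likely.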
The first time the workflow ran cleanly and I saw “Federated token exchange successful” in the logs, it felt like progress you could trust. The system was now authenticating on its own terms.
Managing Terraform state
One of the next architectural decisions was where to store the Terraform state file. Keeping it local would have worked for a single user, but not for a proper CI/CD setup. I wanted everything shared between local development and GitHub Actions, so I set up a remote backend in Azure Blob Storage.
A simple backend block connected everything together.
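A backend block for Azure Blob Storage typically looks something like the sketch below. The resource group, storage account, container, and key names are hypothetical placeholders, and the storage account is assumed to exist before terraform init runs.

```hcl
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-tfstate"             # hypothetical
    storage_account_name = "sttfstateexample"       # hypothetical; must be globally unique
    container_name       = "tfstate"                # hypothetical
    key                  = "flood-platform.tfstate" # name of the state blob inside the container

    # Authentication is left out of the block on purpose: in the workflow,
    # OIDC can be switched on with the ARM_USE_OIDC environment variable,
    # while local runs can fall back to an `az login` session.
  }
}
```

Leaving credentials out of the block lets the same configuration serve both local development and GitHub Actions.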
Terraform now read from and wrote to a single shared state, which meant that every run, whether local or through the workflow, would see the same reality. That one change made the whole setup feel professional.
The first green run
When the pipeline ran end to end for the first time, the logs told the story:
Run azure/login@v2
Federated token exchange successful
Run terraform init
Run terraform apply
No changes. Your infrastructure matches the configuration.
That line, “no changes,” meant everything was aligned. The state was consistent, the infrastructure existed exactly as defined, and the workflow had handled it automatically.
It’s a small thing, but seeing a system behave predictably is one of the quiet joys of engineering.
Lessons from the setup
There were a few lessons that I’ll carry into any future design.
Automate from the start. Waiting until the end just means rewriting what you’ve already done by hand.
Avoid secrets wherever possible. OIDC works beautifully once configured, and it aligns with how modern zero trust systems should behave.
Keep Terraform state remote. It stops local and CI runs from working against different versions of the truth and creates a single source of truth for every deployment.
And finally, tag everything. Even personal projects deserve governance. It builds the habit of thinking about ownership and accountability early.
What comes next
With the infrastructure and CI/CD layer now solid, the next stage shifts from platform setup to integration. Before ingesting a single record, I need a way to develop locally while running workloads directly on Databricks compute.
That’s where Databricks Connect comes in: the bridge that links my local environment to the cloud workspace, making development faster, repeatable, and scalable.
Next article: Connecting Local Development to Databricks Compute.
Before continuing the build, there was a brief detour into the world of identity, Terraform state locks, and authentication negotiation between Azure and Databricks. It became one of those engineering days where progress feels less like a straight line and more like a tightening spiral, each loop bringing a little more clarity. I documented that episode in A Day in the Trenches of Identity and Automation, because those moments deserve to be understood just as much as the clean green ticks in a pipeline. Build systems long enough and you realise stability is not a default state; it is earned, tested, and reinforced through experience.

