Giving Pulumi a Spin
26/02/2023
I recently needed to spin up some multi-tenant Azure infrastructure for a proof of concept. This required similar but differing deployments, with frequently changing infrastructure components, based on a self-service model. One goal was to have a central solution deploying to multiple tenants. This was an interesting design challenge!
My requirements were:
- Create a standard set of infrastructure that didn't vary between deployments
- Add multiple specialised resources that can vary between deployments with differing configurations
- The deployment process should add or remove the variable resources so that what is deployed matches what is defined for that particular deployment.
When using infrastructure-as-code (IaC) techniques to build infrastructure, the deployment artefacts are usually kept in source control and deployed using continuous delivery (CD) techniques. In these scenarios the infrastructure is relatively static and not deployed that frequently. In my scenario, the deployment could happen many times an hour during testing. In addition, the multi-tenant nature made it difficult to automate the deployment method, as each deployment needed to target a different tenant ID. I needed a data-driven approach to generating the deployment artefacts.
I was struggling to think of a good way to do this using Azure DevOps and Bicep or ARM templates. Text templating (e.g. Liquid templates) seemed like a potential option, but quite a brittle one. The flow to the backend would be feasible as part of a deployment pipeline, but updating the data source would likely be fairly manual.
I wanted a simpler, more automated process. I had good success doing similar work previously using Farmer (an F# system that builds out ARM templates using F# computation expressions), but it does require teaching people to use F#.
I remembered a couple of articles I'd read recently about Pulumi and thought that this might be a good fit due to its use of code to define resources; this would give me a chance to handle the deployment differently based on some incoming parameters.
Getting started with Pulumi and the build
I started with installing the CLI using the instructions, then set about building out my infrastructure as a class in C#, following the tutorial. To build out a Pulumi deployment you create a class in C#, inheriting from the Stack class and build out the deployment as part of the constructor.
One of the great things about using a programmatic deployment model is that you can create a different deployment using external inputs. I used this to build a stack that contained a consistently named set of base resources, plus a set of resources created from input data held in a separate data store. After building this out I had my target deployment. I was able to put the details of the variable resources into the data store and then run the CLI to create those resources on demand.
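To make the "base plus variable resources" idea concrete, here is a minimal sketch in plain Python (the original implementation was in C#). Every name in it (`ResourceSpec`, `BASE_RESOURCES`, `desired_resources`, `plan`) is hypothetical and chosen for illustration only. Pulumi performs the equivalent comparison itself against its stored state, so the `plan` function is only there to show what "add or remove the variable resources" means, not something you would write alongside Pulumi.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceSpec:
    """One resource to deploy; 'kind' and 'name' are illustrative fields."""
    kind: str
    name: str


# Resources every deployment gets (hypothetical examples).
BASE_RESOURCES = [
    ResourceSpec("resource_group", "rg-core"),
    ResourceSpec("storage_account", "stcore"),
]


def desired_resources(variable_rows):
    """Combine the fixed base set with rows read from the data store."""
    variable = [ResourceSpec(row["kind"], row["name"]) for row in variable_rows]
    return BASE_RESOURCES + variable


def plan(desired, deployed):
    """Return (to_create, to_delete) by comparing desired and deployed sets."""
    desired_set, deployed_set = set(desired), set(deployed)
    order = lambda r: (r.kind, r.name)
    to_create = sorted(desired_set - deployed_set, key=order)
    to_delete = sorted(deployed_set - desired_set, key=order)
    return to_create, to_delete
```

Changing a row in the data store changes the desired set, and the next deployment run creates or removes resources to match.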
Deploy on demand
The next part of my build was running the deployment on demand. My normal preference would be to run as much as possible from CI/CD, so I investigated using Azure DevOps to perform the build (perhaps initiated from a webhook), but because I wanted the deploy to be self-service, I decided against this approach. A CI/CD-initiated build can be slow to start due to the need to acquire a worker and deploy a container. A self-hosted agent would get around this, but would be quite expensive. I also ruled out using Azure Functions, as the deploy might not finish within the maximum function duration (10 minutes on the Consumption plan).
One thing that I've done successfully in the past is to deploy a .NET worker service and decouple the API written in Azure Functions from the back-end service, communicating via a queue. This seemed like a promising approach. It has become much easier since .NET Core 3.1, which comes with a worker service template; on earlier solutions I'd had to assemble this by hand. Microsoft also recently released the Azure Container Apps service - deploying one of the workers there allows the code to run serverlessly, spinning up when a message enters the queue. This functionality is enabled by the KEDA scaling provided by the Container Apps service.
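The queue-based decoupling can be sketched with nothing but the Python standard library. This is an illustration of the pattern only: an in-memory `queue.Queue` stands in for the Azure storage queue, and `worker`, `enqueue_deployment` and the `tenant_id` message field are hypothetical names, not part of the original .NET solution.

```python
import json
import queue
import threading


def worker(messages, deploy, stop):
    """Pull deployment requests off the queue until asked to stop."""
    while not stop.is_set():
        try:
            body = messages.get(timeout=0.1)
        except queue.Empty:
            continue  # nothing waiting; poll again
        try:
            deploy(json.loads(body))  # the long-running work happens here
        finally:
            messages.task_done()


def enqueue_deployment(messages, tenant_id):
    """What the API side does: drop a small message and return immediately."""
    messages.put(json.dumps({"tenant_id": tenant_id}))


if __name__ == "__main__":
    messages = queue.Queue()
    stop = threading.Event()
    thread = threading.Thread(
        target=worker, args=(messages, print, stop), daemon=True
    )
    thread.start()
    enqueue_deployment(messages, "tenant-a")
    messages.join()  # wait until the worker has handled the message
    stop.set()
    thread.join()
```

The API call returns as soon as the message is queued, so a slow deployment never holds up the caller, and the worker can be scaled (or scaled to zero) independently of the API.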
Using the automation API
Switching to a worker meant running the Pulumi deployment from code rather than by using the Pulumi CLI. Pulumi provide an "Automation API" to do this, which was relatively straightforward to get running. I generated an API token and then accessed this using .NET configuration injected into my worker class. I followed the Inline Program example from the Automation API examples. Once this was in place, I added code to the worker to wait for and pull messages from an Azure storage queue, and used each message, together with a query against my data store, to build out the deployment resources. Once done, I built out a simple endpoint in my Azure Functions project to drop messages into the queue. I then built out a Dockerfile. In order to get the worker to run I had to install the Pulumi CLI as part of the Docker image. I tested this locally and then pushed the container app to Azure.
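As a rough illustration of the Automation API step, here is a sketch of an inline program in Python (the worker described in this post was written in C#, so this is a translation of the idea, not the original code). It assumes the `pulumi` and `pulumi-azure-native` packages, the Pulumi CLI and the azure-native provider plugin are available, that Azure credentials are supplied separately, and that the API token is in the `PULUMI_ACCESS_TOKEN` environment variable. The one-stack-per-tenant naming, the project name, the `uksouth` location and the choice of storage accounts as the variable resources are all assumptions made for the example.

```python
import os


def stack_name_for(tenant_id):
    """One stack per tenant; this naming scheme is an assumption."""
    return f"tenant-{tenant_id}"


def deploy_tenant(tenant_id, storage_names, project_name="multi-tenant-poc"):
    """Run an inline Pulumi program for one tenant via the Automation API."""
    # Imported lazily so the helper above works without Pulumi installed.
    from pulumi import automation as auto
    from pulumi_azure_native import resources, storage

    if "PULUMI_ACCESS_TOKEN" not in os.environ:
        raise RuntimeError("PULUMI_ACCESS_TOKEN must be set")

    def program():
        # Base resources: identical for every tenant.
        group = resources.ResourceGroup("rg-core")
        # Variable resources: driven by the incoming data.
        for name in storage_names:
            storage.StorageAccount(
                name,
                resource_group_name=group.name,
                sku=storage.SkuArgs(name=storage.SkuName.STANDARD_LRS),
                kind=storage.Kind.STORAGE_V2,
            )

    stack = auto.create_or_select_stack(
        stack_name=stack_name_for(tenant_id),
        project_name=project_name,
        program=program,
    )
    # Point the provider at the target tenant and an (assumed) region.
    stack.set_config("azure-native:tenantId", auto.ConfigValue(value=tenant_id))
    stack.set_config("azure-native:location", auto.ConfigValue(value="uksouth"))
    result = stack.up(on_output=print)
    return result.summary.result
```

Because the list of storage accounts is passed in, running the same function with a shorter list removes the accounts that are no longer listed on the next `up`.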
The Dockerfile additions look like:
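# any setup here
# The Automation API shells out to the Pulumi CLI, so the CLI must be in the image
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl ca-certificates \
    && rm -rf /var/lib/apt/lists/*
RUN curl -fsSL https://get.pulumi.com | sh
# Prepend the installer's default location rather than hard-coding the whole PATH
ENV PATH="/root/.pulumi/bin:${PATH}"
# your entrypoint here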
The final architecture looks like this:
I was pleased with how effective Pulumi was for creating this integration and look forward to using it again in the future. Using the background worker in conjunction with the Function App is a useful pattern for creating decoupled services, and Container Apps makes this pattern really easy to adopt. Both services allow 'scale to zero', so this type of application can be run very cost-effectively.