Azure locking support implementation.
Digger is an open-source alternative to Terraform Cloud. It makes it easy to run Terraform Plan and Apply in GitHub Actions.
Digger already had locking support on AWS (via DynamoDB) and on GCP (via Buckets). In our GCP support announcement, Azure support was requested by u/BraakOSRS - and now it’s here! The PR was merged just yesterday.
Features
- Add ability to Lock & Unlock through Azure Storage Account tables
- Support Shared Key authentication
- Support Connection string authentication
- Support Client secret authentication
- Create table if it doesn't already exist
- Normalize lock names to work with Digger's format, since Storage Account tables don't accept # and / characters
- Provide meaningful errors to the user at every step
How to use
There is one mandatory environment variable the user has to set in order to use Azure-based locks: DIGGER_AZURE_AUTH_METHOD, which can take one of the three values below:
- SHARED_KEY
- CONNECTION_STRING
- CLIENT_SECRET
Then, depending on the value of DIGGER_AZURE_AUTH_METHOD, the user will have to set additional environment variables, as illustrated in the sketch after this list.
- SHARED_KEY
  - DIGGER_AZURE_SA_NAME: storage account name
  - DIGGER_AZURE_SHARED_KEY: shared key of the storage account
- CONNECTION_STRING
  - DIGGER_AZURE_CONNECTION_STRING: connection string
- CLIENT_SECRET
  - DIGGER_AZURE_TENANT_ID: tenant ID to use
  - DIGGER_AZURE_CLIENT_ID: client ID of the service principal
  - DIGGER_AZURE_CLIENT_SECRET: secret of the service principal
  - DIGGER_AZURE_SA_NAME: storage account name
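To make this concrete, here is a minimal sketch of how these variables could map onto the Azure SDK for Go (aztables and azidentity). The newServiceClient helper, the package name, and the table endpoint URL format are illustrative assumptions rather than a copy of Digger's actual code:

```go
// Sketch only: how the three DIGGER_AZURE_AUTH_METHOD values might map onto
// Azure SDK for Go client constructors. Names here are illustrative.
package locking

import (
	"fmt"
	"os"

	"github.com/Azure/azure-sdk-for-go/sdk/azidentity"
	"github.com/Azure/azure-sdk-for-go/sdk/data/aztables"
)

func newServiceClient() (*aztables.ServiceClient, error) {
	switch os.Getenv("DIGGER_AZURE_AUTH_METHOD") {
	case "SHARED_KEY":
		account := os.Getenv("DIGGER_AZURE_SA_NAME")
		cred, err := aztables.NewSharedKeyCredential(account, os.Getenv("DIGGER_AZURE_SHARED_KEY"))
		if err != nil {
			return nil, err
		}
		// Assumes the default public-cloud table endpoint.
		serviceURL := fmt.Sprintf("https://%s.table.core.windows.net/", account)
		return aztables.NewServiceClientWithSharedKey(serviceURL, cred, nil)
	case "CONNECTION_STRING":
		return aztables.NewServiceClientFromConnectionString(os.Getenv("DIGGER_AZURE_CONNECTION_STRING"), nil)
	case "CLIENT_SECRET":
		cred, err := azidentity.NewClientSecretCredential(
			os.Getenv("DIGGER_AZURE_TENANT_ID"),
			os.Getenv("DIGGER_AZURE_CLIENT_ID"),
			os.Getenv("DIGGER_AZURE_CLIENT_SECRET"),
			nil,
		)
		if err != nil {
			return nil, err
		}
		serviceURL := fmt.Sprintf("https://%s.table.core.windows.net/", os.Getenv("DIGGER_AZURE_SA_NAME"))
		return aztables.NewServiceClient(serviceURL, cred, nil)
	default:
		return nil, fmt.Errorf("unsupported DIGGER_AZURE_AUTH_METHOD")
	}
}
```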
Why tables?
Distributed locking can be implemented in a number of ways. On the surface it seems to make sense to keep the implementation consistent across cloud providers: on AWS we are using DynamoDB, so on Azure it must be Cosmos DB, right? We found, however, that simply replicating the approach from one cloud provider to another does not make much sense. On GCP we went with Buckets, mainly because they are strongly consistent on updates, which makes them the simplest and cheapest way to achieve what locks are for.
On Azure we picked Storage Tables: they scale and store structured data, but the volume of data here is next to nothing, so it’s effectively free. We can perform basic locking with a resource ID, and that’s all we need - a similar approach to Buckets on GCP. We can also introduce additional fields to store in the backend if needed, which is more flexible than using storage buckets.
How it works
Digger is written in Go, so the Azure locking mechanism implements the same Lock interface as its AWS and GCP counterparts:
Lock(lockId int, prNumber string) (bool, error)
Unlock(prNumber string) (bool, error)
GetLock(prNumber string) (*int, error)
Lock() acquires a lock with lockId for a specific pull request. On Azure this creates a record in the table and stores the lock ID as a column. Before creating the record it checks whether a lock already exists for that PR: if it is already locked by the current PR, no action is performed; if it is locked by another PR, it fails.
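A simplified sketch of what this could look like with the aztables client is below. The AzureLock type, the "digger" partition key, and the LockId property name are assumptions for illustration, not necessarily how Digger names things; the Unlock and GetLock sketches further down continue the same file.

```go
// Simplified sketch of the Azure lock implementation (not Digger's exact code).
// The imports below also cover the Unlock and GetLock sketches later in this section.
package locking

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"

	"github.com/Azure/azure-sdk-for-go/sdk/azcore"
	"github.com/Azure/azure-sdk-for-go/sdk/data/aztables"
)

// AzureLock holds a client for the Storage Account table used for locks.
type AzureLock struct {
	tableClient *aztables.Client
}

// Lock stores a record keyed by prNumber; it is a no-op if the same lock is
// already held, and fails if the PR is locked by a different lock id.
func (l *AzureLock) Lock(lockId int, prNumber string) (bool, error) {
	existing, err := l.GetLock(prNumber)
	if err != nil {
		return false, err
	}
	if existing != nil {
		// Already locked: succeed only if it is the same lock id.
		return *existing == lockId, nil
	}

	entity := aztables.EDMEntity{
		Entity: aztables.Entity{
			PartitionKey: "digger",
			RowKey:       prNumber, // lock name, normalized to avoid '#' and '/'
		},
		Properties: map[string]any{"LockId": int32(lockId)},
	}
	payload, err := json.Marshal(entity)
	if err != nil {
		return false, err
	}
	// AddEntity fails with a conflict if the row already exists, which keeps
	// the lock exclusive even under concurrent runs.
	if _, err := l.tableClient.AddEntity(context.Background(), payload, nil); err != nil {
		return false, err
	}
	return true, nil
}
```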
Unlock() releases the lock that was acquired by the specified PR. On Azure this deletes the record from the table if it exists; if it does not exist, no action is performed.
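Continuing the same sketch, Unlock could look roughly like this (treating a missing record as already released, and returning true in that case, is an assumption):

```go
// Unlock deletes the lock record for prNumber; a missing record is not an error.
func (l *AzureLock) Unlock(prNumber string) (bool, error) {
	_, err := l.tableClient.DeleteEntity(context.Background(), "digger", prNumber, nil)
	if err != nil {
		var respErr *azcore.ResponseError
		if errors.As(err, &respErr) && respErr.StatusCode == http.StatusNotFound {
			return true, nil // nothing to release
		}
		return false, err
	}
	return true, nil
}
```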
GetLock() retrieves the lock that was acquired by the PR, if it exists. On Azure this retrieves the record for prNumber if it exists, otherwise it returns nil.
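And GetLock, again as part of the same hypothetical sketch (how the LockId property deserializes is handled defensively, since the exact type depends on how the SDK decodes table numbers):

```go
// GetLock returns the lock id stored for prNumber, or nil if no record exists.
func (l *AzureLock) GetLock(prNumber string) (*int, error) {
	resp, err := l.tableClient.GetEntity(context.Background(), "digger", prNumber, nil)
	if err != nil {
		var respErr *azcore.ResponseError
		if errors.As(err, &respErr) && respErr.StatusCode == http.StatusNotFound {
			return nil, nil // no lock held for this PR
		}
		return nil, err
	}
	var entity aztables.EDMEntity
	if err := json.Unmarshal(resp.Value, &entity); err != nil {
		return nil, err
	}
	switch v := entity.Properties["LockId"].(type) {
	case int32:
		lockId := int(v)
		return &lockId, nil
	case float64:
		lockId := int(v)
		return &lockId, nil
	}
	return nil, fmt.Errorf("lock record has no usable LockId property")
}
```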
Motivation behind Digger
We are often asked what benefit Digger provides compared to simply running terraform plan / apply in an action. The short answer is: if that is enough for your use case, then a specialised tool would indeed be overkill. But quite often it is not enough.
Terraform being stateful means that each plan / apply run needs to be aware of the state and behave accordingly. Race conditions against the same state can wreak havoc; but one run at a time repo-wide is impractical too. To make matters worse, code alone does not contain enough information to decide whether to run or to wait, because the same terraform code can have multiple "instances" in different environments (just different tfvars). This means that there needs to be some sort of orchestration that is aware of the state.