Manage multiple Terraform projects in a monorepo

Table of contents

Intro
Assumptions & requirements
Repository structure
Project development workflow
Module development workflow
Pros and cons
Conclusion

Intro

There are many ways to organize and manage Terraform projects. Some setups require additional tools to manage variables, environments/stages, deployments and many other aspects of those projects. By "additional" I mean anything beyond Terraform itself, git and CI/CD tooling.

Let’s take a look at one possible way to organize and manage a monorepo setup, which will contain multiple projects and Terraform modules, with deployments spanning across multiple targets such as AWS accounts or Azure subscriptions.

The presented setup is geared towards in-house use. A consultancy working with multiple clients may wish to keep Terraform modules separate, so that they can be reused across different client projects.

Assumptions & requirements

In order to help understand the setup, let me walk through a few assumptions and requirements:

The setup must support management of multiple Terraform projects (1 project = 1 state) and versioned Terraform modules which are used in projects.
Modules must be versioned with semantic versioning.
No additional tools will be used.
Git branches will not be used to manage different deployments.
Consequently, the git main branch must always contain the source code for all deployed projects, in all live environments.
Deployments to development environments can be done from a developer’s machine to enable development of the projects and modules.
Deployments to any other environment must be done through CI/CD. Developers will have only read access to these environments.
GitHub will be used to host the repository, and GitHub Actions will be used to run CI/CD pipelines. This means that the pipeline definitions are managed right alongside the Terraform code.
Terraform state is managed in a shared location that is specific to the deployment target / environment. On AWS this is an S3 bucket with a DynamoDB table for locking, and on Azure a Storage Account. Each AWS account / Azure subscription will have its own state store.
Terraform plans must be reviewed and approved before deployments to live environments other than development can be done.
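
To make the state requirement concrete, here is a minimal sketch of what the backend configuration of one project could look like on AWS. The bucket name, table name, region and state key are hypothetical placeholders, and placing the block in providers.tf is just one option. The bucket and table themselves would be created by the tf_state project of the same environment.

```hcl
# live/development/api/providers.tf (sketch, all values are hypothetical)
terraform {
  backend "s3" {
    # One state bucket per AWS account / environment.
    bucket = "example-development-tf-state"

    # One key per project, so that 1 project = 1 state.
    key = "api/terraform.tfstate"

    region = "eu-west-1"

    # DynamoDB table used for state locking.
    dynamodb_table = "example-development-tf-locks"

    encrypt = true
  }
}
```

With this layout, the staging and production copies of the project would differ only in the bucket and table names, since each account has its own state store.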

Why are git branches not used to manage deployments? If the project deployments to live environments are governed by git branches, it becomes extremely difficult to see, compare and understand what is deployed and where at any given time.

The main design principle of this approach is to keep things as simple as possible.

Repository structure

Before I go through the development and deployment workflows, let’s see how the repository is organized:

terraform-monorepo> tree -L 4 -a
.
├── .github
│   ├── actions
│   │   ├── deploy
│   │   │   └── action.yml
│   │   └── prepare-plan
│   │       └── action.yml
│   └── workflows
│       ├── deploy-main.yml
│       └── pr-to-main.yml
├── .gitignore
├── LICENSE
├── README.md
├── docs
│   ├── whats-in-here.txt
│   └── workflow.jpg
├── live
│   ├── development
│   │   ├── api
│   │   │   ├── environment.auto.tfvars
│   │   │   ├── main.tf
│   │   │   ├── outputs.tf
│   │   │   ├── providers.tf
│   │   │   └── variables.tf
│   │   ├── data_processor
│   │   │   ├── user-data.sh
│   │   │   ├── environment.auto.tfvars
│   │   │   ├── main.tf
│   │   │   ├── outputs.tf
│   │   │   └── variables.tf
│   │   └── tf_state
│   │       ├── environment.auto.tfvars
│   │       ├── main.tf
│   │       ├── outputs.tf
│   │       └── variables.tf
│   ├── production
│   │   ├── api
│   │   │   └── [omitted-for-brevity]
│   │   ├── data_processor
│   │   │   └── [omitted-for-brevity]
│   │   └── tf_state
│   │       └── [omitted-for-brevity]
│   └── staging
│       ├── api
│       │   └── [omitted-for-brevity]
│       ├── data_processor
│       │   └── [omitted-for-brevity]
│       └── tf_state
│           └── [omitted-for-brevity]
└── modules
    ├── api_layer
    │   ├── v1.0.0
    │   │   └── [omitted-for-brevity]
    │   └── v1.1.0
    │       ├── main.tf
    │       ├── outputs.tf
    │       └── variables.tf
    ├── processing_cluster
    │   ├── v1.0.0
    │   │   └── [omitted-for-brevity]
    │   ├── v1.0.1
    │   │   └── [omitted-for-brevity]
    │   └── v1.2.0
    │       ├── main.tf
    │       ├── outputs.tf
    │       └── variables.tf
    └── vpc
        └── v1.0.0
            ├── main.tf
            ├── outputs.tf
            └── variables.tf

Whoa, that’s a lot of directories! Let’s break it down bit by bit, and start with root-level directories:

Directory .github will contain the CI/CD workflow and action definitions.
Directory docs will contain all assets that are not part of the actual source code, such as images for documentation.
Directory live will contain each environment/stage as subdirectory.
Directory modules will contain all custom Terraform modules.

Let’s look into modules next:

Each module will have its own directory.
Modules are versioned with subdirectories. In a project, the correct version of a module is imported with source = "../../../modules/vpc/v1.0.0".
When a new version of a module is needed, it gets a new subdirectory named according to semantic versioning.
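
As a rough illustration of the points above, a project could pin a module version like this. The source path is the one mentioned earlier; the cidr_block input is a hypothetical example, since the real inputs are whatever modules/vpc/v1.0.0/variables.tf declares.

```hcl
# live/development/api/main.tf (sketch)
module "vpc" {
  # Relative path from live/<stage>/<project> up to the repository root,
  # then down into the directory of the pinned module version.
  source = "../../../modules/vpc/v1.0.0"

  # Hypothetical input, shown only to make the block complete.
  cidr_block = "10.0.0.0/16"
}
```

Because the source is a local path, the version argument of the module block cannot be used here; the directory name carries the version instead.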

And finally the live directory:

Individual Terraform projects are here as subdirectories, under respective environment/stage directories.
Each project will be deployed separately, and will have its own Terraform state.
Terraform plan and apply will be run against each project-specific subdirectory, e.g. development/api or production/data_processor in the above example.
Each stage (development) of each solution (api), e.g. development/api, is handled as an individual project. This choice will potentially lead to code duplication between development, staging and production environments, but this can be reduced with smart use of modules. Some duplication will obviously remain, but this is the price we must pay in order to have clear visibility into existing and deployed code without getting lost within git branches or Terraform workspaces.
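
One way to keep that duplication small, sketched below, is to let the project files be nearly identical across stages and push the differences into environment.auto.tfvars, which Terraform loads automatically. The variable names and the module inputs are hypothetical; only the file names and the module path come from the repository structure above.

```hcl
# live/development/api/variables.tf
# (could be identical in staging and production)
variable "environment" {
  type        = string
  description = "Name of the stage this project is deployed to."
}

variable "instance_count" {
  type        = number
  description = "Hypothetical sizing knob that differs per stage."
}

# live/development/api/main.tf
module "api_layer" {
  source = "../../../modules/api_layer/v1.1.0"

  # Hypothetical module inputs, assumed for this sketch.
  environment    = var.environment
  instance_count = var.instance_count
}

# live/development/api/environment.auto.tfvars would then hold only
# the stage-specific values, for example:
#   environment    = "development"
#   instance_count = 1
```

Comparing two stages then mostly comes down to comparing their tfvars files and the module versions they reference.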

Project development workflow

How does the development and deployment workflow look? The idea is to keep things as simple as possible, and as stated earlier, the code for all live environments must be in the main branch after each development cycle.

Figure: project development workflow diagram

1. Develop the project locally in its stage- and project-specific subdirectory, in an issue-nn or dev branch.
2. While developing, run terraform plan and terraform apply against the dev environment, since that is the only environment to which you as a developer have write access.
3. When you are confident that your solution works, add or update the GitHub Actions workflow definition. The definition file is shared between all projects. Code duplication in workflows can be minimized by using composite actions to share CI/CD code between projects.
4. Commit and push your changes to your development branch on GitHub, e.g. issue-nn or dev. Create a pull request to the main branch.
5. This PR will trigger the CI pipeline pr-to-main.yml, which will validate the source code and run terraform plan for all projects. The plans’ outputs are written as comments to the PR. Changes should be found only for the projects that you changed! You did not change dev, test and prod at the same time, did you?
6. Review the plans, and if all looks good, merge the changes to the main branch. If they do not look as expected, iterate back to step 1.
7. When the changes are merged to the main branch, the CD pipeline deploy-main.yml will run terraform apply for all projects.
8. After the changes are in the main branch and successfully deployed, delete the development branch.

Module development workflow

The module development workflow is very similar to project development.

Figure: module development workflow diagram

1. Create a new subdirectory for the new module version and a new development project where the module can be tested while in development. This dev project is deleted when the module version is ready.
2. While developing, run terraform plan and terraform apply against the dev environment, since that is the only environment to which you as a developer have write access.
3. When you are confident that your module works, commit the changes and push them to your development branch on GitHub, e.g. issue-nn or dev. Create a pull request to the main branch.
4. Even though the CI pipeline will run, no changes should be found at this stage, since only module code was changed.
5. Review the PR comments to confirm that this is indeed the case.
6. Once the PR is merged, the same CD pipeline runs as with project deployments: terraform apply will be run for all projects, but without any changes.
7. After the changes are in the main branch, the development branch can be deleted.
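
The temporary development project from the first step can be very small. The sketch below assumes a hypothetical directory live/development/processing_cluster_test and a hypothetical name_prefix input; neither is part of the repository shown earlier.

```hcl
# live/development/processing_cluster_test/main.tf
# Hypothetical throwaway project, deleted once the module version is ready.
module "processing_cluster" {
  # Points at the module version that is still under development.
  source = "../../../modules/processing_cluster/v1.2.0"

  # Hypothetical input; use whatever the new version's variables.tf declares.
  name_prefix = "module-test"
}
```

Since existing projects keep referencing the older version directories, their plans stay empty until one of them is explicitly switched to the new path.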

Pros and cons

There are many benefits to the presented setup:

Single source of truth. All source code is managed in the same place, and visible in developer’s local filesystem.
Comparing different projects and stages is trivially easy.
All available modules and their versions are easily discoverable without external documentation and indexes.
Source code validation and linting can be done in one place, for example with pre-commit hooks.
No need to set up git submodules.
No need for cross-repository access in CI/CD.
Deployment configuration can be easily shared.
Since all CI/CD pipelines are run from the same repository, each target (AWS account, Azure subscription, etc.) requires only one authentication and authorization setup.

There are also some things that are less than optimal:

For a new developer who is just starting with one project, the amount of code can feel overwhelming.
Modules cannot be versioned with git tags, because they are part of the monorepo which is versioned as a whole.
Compared with a repository that holds a single project, CI/CD pipelines will take longer to run, even when only one project has changed. Parallelism and smart caching can help, but probably only to a limited extent.
Since all projects are within the same repository and deployment configuration, the risk of mixing things up is greater than with individual project repositories.

Conclusion

While a monorepo setup is not perfect, in my opinion it offers far more benefits than individual project repositories. For a new developer a monorepo can feel overwhelming, but it helps a lot to focus on one project (subdirectory) at a time and ignore the rest of the codebase.