Advantages and Limitations of Terragrunt-Managed Backends

Chris / February 16, 2021

Terraform uses state files to track the resources it creates back to resource definitions in your *.tf files. Each root module has a backend configuration that determines how its state is stored. Terraform uses a local backend by default, storing state on the local machine it is run on, but you’ll generally want a remote backend for production use that includes versioning and backups of your state files. Many remote backends also sequence changes to the state file through a locking mechanism, which avoids collisions when multiple developers modify the same root module.

Terraform lacks an easy means of sharing backend configurations across multiple root modules, however. As discussed here, Terragrunt fills this gap, making remote state much easier to work with, but there are also some limitations to be aware of.

Chris, BTI360 engineer and author of thirstydeveloper.io, covers the advantages and limitations of Terragrunt-managed backends in today’s post.

Remote State with Terragrunt and AWS

Terraform has you specify backend configurations in your *.tf files. With Terragrunt, you instead provide a remote_state block within one of its configuration files (e.g., terragrunt.hcl), and Terragrunt then generates the necessary Terraform backend configuration based on what you specify. You’ll often place the remote_state block inside a parent config, which you then include in each root module’s terragrunt.hcl so you don’t need to duplicate the configuration for each root module.
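As a minimal sketch of that include, a root module’s terragrunt.hcl might look like the following. It assumes the parent config is named root.hcl and sits in an ancestor directory; the path in the comment is hypothetical.

```hcl
# live/dev/vpc/terragrunt.hcl (hypothetical path)

# Pull in the shared parent config, including its remote_state block.
# find_in_parent_folders walks up the directory tree until it finds root.hcl.
include {
  path = find_in_parent_folders("root.hcl")
}
```

With this in place, the root module needs no backend configuration of its own.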

BTI360 teams use Amazon Web Services heavily. As such, we typically use the S3 backend for storing our state. Here’s an example configuring the S3 backend inside our root.hcl parent configuration file (see footnote 1):

locals {
  # Get the path of the root.hcl file
  root_deployments_dir = get_parent_terragrunt_dir()

  # Get the path of the terragrunt.hcl file including the root.hcl file
  # relative to the root.hcl file
  relative_deployment_path = path_relative_to_include()

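  # Split the relative path into its components, dropping any empty segments
  deployment_path_components = compact(split("/", local.relative_deployment_path))

  # We call root modules stacks. Get the name of the stack from the path.
  stack = reverse(local.deployment_path_components)[0]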
}

# For each stack, generate a backend.tf file with the appropriate backend config
# This stores the state for all our stacks inside the same bucket, with a prefix
# based on the path of the stack in our infrastructure repository.
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite"
  }
  config = {
    bucket  = "terraform-state"
    region  = "us-east-1"
    encrypt = true

    key = "${dirname(local.relative_deployment_path)}/${local.stack}.tfstate"

    dynamodb_table            = "terraform-state-locks"
    accesslogging_bucket_name = "terraform-state-logs"
  }
}

For details on how this works, see this post. Summarized, we are using:

  • A single terraform-state S3 bucket for all deployment state files
  • A unique key within that bucket for each root module, based on the relative path from the root.hcl
  • A terraform-state-locks DynamoDB table to sequence state file changes
  • A terraform-state-logs S3 bucket for logging all S3 access requests to our state bucket
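To make the key layout concrete, here is roughly what the generated backend.tf could look like for a stack at the hypothetical path dev/us-east-1/vpc relative to root.hcl, assuming the stack name is the last component of that path. Terragrunt-only settings such as accesslogging_bucket_name are used by Terragrunt itself and should not appear in the generated Terraform block.

```hcl
# backend.tf, generated by Terragrunt for the hypothetical stack dev/us-east-1/vpc
terraform {
  backend "s3" {
    bucket         = "terraform-state"
    region         = "us-east-1"
    encrypt        = true
    key            = "dev/us-east-1/vpc.tfstate"
    dynamodb_table = "terraform-state-locks"
  }
}
```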

Advantages of Terragrunt-Managed Backends

Terragrunt does a ton of work for you when using the remote_state configuration block with the S3 backend. It will:

  1. Automatically create the state and log buckets if they don’t exist
  2. Automatically create the lock table if it does not exist
  3. Configure the state bucket (but not the logs bucket) with:
    1. Versioning and encryption enabled
    2. Public access disabled
    3. TLS-only access enforced
  4. Enable encryption on the lock table

This makes it easy for teams to get started using remote state with sensible security defaults.

Limitations of Terragrunt-Managed Backends

While extremely helpful, there are some limitations to Terragrunt’s remote_state configuration block to be aware of.

1. Terragrunt lacks security defaults on the log bucket
If Terragrunt auto-creates the log bucket, it does not appear to enable encryption or explicitly block public access. For this reason, we strongly recommend you consider self-management of the log bucket rather than letting Terragrunt create it for you.
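A self-managed log bucket could be sketched in plain Terraform as below. This assumes AWS provider v4 or later (where encryption and public access settings are separate resources) and a hypothetical stack that owns the bucket. The permissions that let S3 deliver access logs to the bucket are omitted for brevity.

```hcl
# Hypothetical stack that owns the access-log bucket, created before
# Terragrunt is pointed at it via accesslogging_bucket_name.
resource "aws_s3_bucket" "state_logs" {
  bucket = "terraform-state-logs"
}

# Encrypt log objects at rest with S3-managed keys.
resource "aws_s3_bucket_server_side_encryption_configuration" "state_logs" {
  bucket = aws_s3_bucket.state_logs.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

# Explicitly block every form of public access.
resource "aws_s3_bucket_public_access_block" "state_logs" {
  bucket = aws_s3_bucket.state_logs.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```

Because the bucket already exists when Terragrunt runs, Terragrunt has nothing to auto-create and the settings above stay under your control.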

2. Disabling auto-creation of the state bucket and lock table is broken
If you have a single team owning the infrastructure, auto-creation is likely safe enough. When you start distributing ownership, with different teams owning different pieces, it can be helpful to centrally control where state files are stored for auditing and compliance purposes. This means avoiding a situation where a team accidentally creates a new state bucket or lock table the larger organization is unaware of.

Terragrunt has a disable_init attribute on the remote_state block, which will prevent auto-creation but, as described by this open issue, also completely disables backend initialization, such that you can’t run any Terraform commands.

Our teams that want to avoid auto-creation currently create the buckets and lock tables outside of Terragrunt and use IAM permissions to restrict Terragrunt to working only with permitted remote state resources.
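One way to express that restriction is an IAM policy that grants access only to the existing bucket and table and never grants s3:CreateBucket or dynamodb:CreateTable. The sketch below uses the core actions the Terraform S3 backend needs; the account ID and resource names are placeholders, and Terragrunt may need additional read-only permissions to inspect the bucket and table, so treat the action list as a starting point.

```hcl
data "aws_iam_policy_document" "terragrunt_state_access" {
  # List the contents of the existing state bucket.
  statement {
    sid       = "ListStateBucket"
    actions   = ["s3:ListBucket"]
    resources = ["arn:aws:s3:::terraform-state"]
  }

  # Read and write state objects inside that bucket only.
  statement {
    sid       = "ReadWriteStateObjects"
    actions   = ["s3:GetObject", "s3:PutObject"]
    resources = ["arn:aws:s3:::terraform-state/*"]
  }

  # Acquire and release locks in the existing lock table only.
  statement {
    sid = "UseLockTable"
    actions = [
      "dynamodb:GetItem",
      "dynamodb:PutItem",
      "dynamodb:DeleteItem",
    ]
    # 123456789012 is a placeholder account ID.
    resources = ["arn:aws:dynamodb:us-east-1:123456789012:table/terraform-state-locks"]
  }
}

resource "aws_iam_policy" "terragrunt_state_access" {
  name   = "terragrunt-state-access" # hypothetical name
  policy = data.aws_iam_policy_document.terragrunt_state_access.json
}
```

Since IAM denies anything not explicitly allowed, a principal limited to this policy cannot create a stray state bucket or lock table.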

3. Terragrunt doesn’t offer full control over the credentials used to access the Terraform state
The remote_state block has a few fields for specifying a role_arn or AWS profile to use for remote state access, but you can’t control things like IAM session tags, transitive tag usage, or assume role policies. If you need access to these features, you’ll have to avoid Terragrunt’s remote_state block.
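One possible alternative is Terragrunt’s generic generate block, which writes whatever backend configuration you give it. The sketch below assumes Terraform 1.6 or later, where the S3 backend accepts a nested assume_role setting (older releases used top-level attributes instead). The role ARN, tag values, and key layout are placeholders.

```hcl
# root.hcl (sketch): write backend.tf directly instead of using remote_state
generate "backend" {
  path      = "backend.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
terraform {
  backend "s3" {
    bucket         = "terraform-state"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-locks"

    # Placeholder role and tags; adjust to your own IAM setup.
    assume_role = {
      role_arn            = "arn:aws:iam::123456789012:role/terraform-state-access"
      tags                = { team = "platform" }
      transitive_tag_keys = ["team"]
    }
  }
}
EOF
}
```

The trade-off is that Terragrunt no longer creates or configures the bucket and lock table for you, so they must already exist.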

4. Terragrunt doesn’t offer full control over all fields on the buckets and table
While Terragrunt applies sensible security defaults, you can’t control everything using its remote_state block. For example, you’ll need to self-manage if you need to specify particular KMS keys for encryption of the bucket or table. Similarly, if you want to set up replication of your Terraform state bucket, self-management is the way to go.
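As an illustration of the KMS case, a self-managed state bucket and lock table encrypted with a customer-managed key might be sketched as follows. This assumes AWS provider v4 or later; the key description and resource labels are hypothetical, and replication is left out.

```hcl
# Customer-managed key shared by the state bucket and the lock table.
resource "aws_kms_key" "state" {
  description = "Encrypts Terraform state and state locks"
}

resource "aws_s3_bucket" "state" {
  bucket = "terraform-state"
}

# Keep a history of every state file revision.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Encrypt state objects with the customer-managed key by default.
resource "aws_s3_bucket_server_side_encryption_configuration" "state" {
  bucket = aws_s3_bucket.state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.state.arn
    }
  }
}

# The S3 backend expects a string hash key named LockID.
resource "aws_dynamodb_table" "locks" {
  name         = "terraform-state-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }

  server_side_encryption {
    enabled     = true
    kms_key_arn = aws_kms_key.state.arn
  }
}
```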

Conclusion

Hopefully, today’s post has given your team a sense of what Terragrunt can and cannot do when managing your Terraform backend. For more information on this approach, including a fully worked example, see part 3 of Chris’ terraform-skeleton series on thirstydeveloper.io.

Footnotes

  1. See here for why we call this file root.hcl

 

