Terraform Backend: Role-Based Access Control – Part 1

Chris / March 17, 2021

We previously discussed using Terragrunt to manage your Terraform backend configuration. As a refresher:

  • A backend controls where Terraform’s state is stored
  • Terraform state maps resources created by Terraform to resource definitions in your *.tf files

The next couple of posts will continue exploring backends, this time with a focus on role-based access control (RBAC).

Terraform state is a sensitive resource. It is likely to contain secrets, including passwords and access tokens. Additionally, recovering from a lost state file means either recreating all the infrastructure it contained or spending some quality time running terraform import commands and hand-modifying state files.
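
For a sense of what that recovery looks like, re-adopting a single existing resource into a fresh state file goes roughly like this (the resource address and bucket name below are purely hypothetical):

# Re-adopt an existing resource into a fresh state file. The address must match
# a resource block already defined in your *.tf files.
terraform import aws_s3_bucket.assets my-existing-assets-bucket

# ...and repeat for every resource the lost state file used to track.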

Chris, BTI360 engineer and author of thirstydeveloper.io, will walk us through how to give our state the protection it deserves using Amazon Web Services.

Controlling State Access with AWS

Our teams generally use the S3 backend, which stores state files as objects within an S3 bucket. When getting started, there are three access levels to consider for your state:

  1. Backend: A dedicated role Terraform will use when accessing and modifying state during operations performed by IAM users or CI/CD.
  2. Developer: Permissions needed for manual modifications/intervention by developers. Restricted from permanently deleting state.¹
  3. Administrative: Has full access to state buckets and objects. Access should be highly restricted.

Today’s post will cover the first of the three. We will:

  1. Create a dedicated TerraformBackend IAM role for all state access when running Terraform
  2. Instruct Terraform to use that role for all S3 backend access

The backend role will give us a foundation to build on with subsequent posts. Let’s get started.

A Chicken and Egg Problem

The first question we have to answer about the backend role is: how are we going to deploy it? We’re all about infrastructure-as-code at BTI360, so creating it by hand isn’t an option.

It’s awkward to create the backend role with Terraform itself because it introduces a chicken and egg problem. Terraform needs the role to access the backend, and it needs to access the backend to create the role. While it’s technically possible to deploy the role with Terraform, by initially storing the state with the local backend and then later changing the backend to S3, that approach feels a little inelegant.
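
For context, that bootstrap dance looks roughly like this (a minimal sketch of the approach we are not taking, assuming the role is defined in an ordinary Terraform root module):

# 1. With the default local backend, create the role; state is written to a
#    local terraform.tfstate file.
terraform init
terraform apply

# 2. Change the backend block to "s3", then re-run init. Terraform detects the
#    backend change and offers to copy the local state into the bucket.
terraform init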

We consider this backend role part of the “operational infrastructure” required to run Terraform and, as discussed in our previous post, we prefer to manage operational infrastructure with CloudFormation. Today, we’ll add a backend role to the stack we created last time.

If you care to jump to the end, here’s a link to the full CloudFormation template.

Backend Role CloudFormation Template

Previously, we added our state bucket, log bucket, and lock table to a CloudFormation stack. Here’s a reminder of what the CloudFormation template looked like:

---
AWSTemplateFormatVersion: '2010-09-09'
Description: Deploy Terraform operational infrastructure
Metadata:
  AWS::CloudFormation::Interface:
    ParameterGroups:
      - Label:
          default: Terraform State Resources
        Parameters:
          - StateBucketName
          - StateLogBucketName
          - LockTableName
Parameters:
  StateBucketName:
    Type: String
    Description: Name of the S3 bucket for Terraform state
  StateLogBucketName:
    Type: String
    Description: Name of the S3 bucket for Terraform state logs
  LockTableName:
    Type: String
    Description: Name of the Terraform DynamoDB lock table

Resources:
  TerraformStateLogBucket:
    Type: 'AWS::S3::Bucket'
    DeletionPolicy: Retain
    UpdateReplacePolicy: Retain
    Properties:
      BucketName: !Ref StateLogBucketName
      AccessControl: LogDeliveryWrite

  TerraformStateBucket:
    Type: 'AWS::S3::Bucket'
    DeletionPolicy: Retain
    UpdateReplacePolicy: Retain
    Properties:
      BucketName: !Ref StateBucketName
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: aws:kms
      LoggingConfiguration:
        DestinationBucketName: !Ref StateLogBucketName
        LogFilePrefix: TFStateLogs/
      PublicAccessBlockConfiguration:
        BlockPublicAcls: True
        BlockPublicPolicy: True
        IgnorePublicAcls: True
        RestrictPublicBuckets: True
      VersioningConfiguration:
        Status: Enabled

  TerraformStateLockTable:
    Type: 'AWS::DynamoDB::Table'
    DeletionPolicy: Retain
    UpdateReplacePolicy: Retain
    Properties:
      TableName: !Ref LockTableName
      AttributeDefinitions:
        - AttributeName: LockID
          AttributeType: S
      KeySchema:
        - AttributeName: LockID
          KeyType: HASH
      BillingMode: PAY_PER_REQUEST

The first change we’ll make is to add a resource to create an IAM managed policy for read-write state access. Add the following to the Resources block:

TerraformStateReadWritePolicy:
  Type: 'AWS::IAM::ManagedPolicy'
  Properties:
    ManagedPolicyName: TerraformStateReadWrite
    Path: /terraform/
    Description: Read/write access to Terraform state
    PolicyDocument:
      Version: 2012-10-17
      # Permissions are based on:
      # https://www.terraform.io/docs/backends/types/s3.html#example-configuration
      # https://github.com/gruntwork-io/terragrunt/issues/919
      Statement:
        - Sid: AllowStateBucketList
          Effect: Allow
          Action:
            - 's3:ListBucket'
            - 's3:GetBucketVersioning'
          Resource: !Sub "arn:aws:s3:::${StateBucketName}"
        - Sid: AllowStateReadWrite
          Effect: Allow
          Action:
            - 's3:GetObject'
            - 's3:PutObject'
          Resource: !Sub "arn:aws:s3:::${StateBucketName}/*"
        - Sid: AllowStateLockReadWrite
          Effect: Allow
          Action:
            - 'dynamodb:DescribeTable'
            - 'dynamodb:GetItem'
            - 'dynamodb:PutItem'
            - 'dynamodb:DeleteItem'
          Resource: !Sub "arn:aws:dynamodb:${AWS::Region}:${AWS::AccountId}:table/${LockTableName}"

If you’re using Terragrunt instead of CloudFormation to create your state bucket, log bucket, and lock table, add two additional policy statements:

- Sid: AllowStateBucketCreation
  Effect: Allow
  Action:
    - 's3:GetBucketAcl'
    - 's3:GetBucketLogging'
    - 's3:CreateBucket'
    - 's3:PutBucketPublicAccessBlock'
    - 's3:PutBucketTagging'
    - 's3:PutBucketPolicy'
    - 's3:PutBucketVersioning'
    - 's3:PutEncryptionConfiguration'
    - 's3:PutBucketAcl'
    - 's3:PutBucketLogging'
  Resource:
    - !Sub "arn:aws:s3:::${StateBucketName}"
    - !Sub "arn:aws:s3:::${StateLogBucketName}"
- Sid: AllowLockTableCreation
  Effect: Allow
  Action:
    - 'dynamodb:CreateTable'
  Resource: !Sub "arn:aws:dynamodb:${AWS::Region}:${AWS::AccountId}:table/${LockTableName}"

This policy grants everything the S3 backend needs to manage state. It’s worth noting that Terragrunt requires a few permissions beyond what the Terraform S3 backend documentation lists, which is why the policy comments reference the Terragrunt issue as well.

In particular, this policy does not grant delete access to the Terraform state. As already discussed, losing your state file can create a tremendous amount of pain. Using a versioned S3 bucket helps, but you can still get into trouble if you delete the bucket itself. Since the backend role does not require delete access, it does not get it.
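
As a concrete illustration of what versioning buys you: if a state object is overwritten or deleted while the bucket itself survives, you can restore an earlier version with the AWS CLI. A rough sketch, using the bucket name from this post and a hypothetical state key:

# List the versions and delete markers recorded for a state object.
aws s3api list-object-versions \
  --bucket bti360-terraform-state \
  --prefix my-stack/terraform.tfstate

# Removing a delete marker by its version ID "undeletes" the object, exposing
# the most recent surviving version again.
aws s3api delete-object \
  --bucket bti360-terraform-state \
  --key my-stack/terraform.tfstate \
  --version-id <delete-marker-version-id>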

Next, add a second resource to create the backend role and attach the policy to it:

TerraformBackendRole:
  Type: 'AWS::IAM::Role'
  Properties:
    AssumeRolePolicyDocument:
      Version: 2012-10-17
      Statement:
        - Effect: Allow
          Principal:
            AWS: !Ref AWS::AccountId
          Action:
            - 'sts:AssumeRole'
          Condition:
            StringEquals:
              aws:PrincipalType: User
            StringLike:
              'aws:PrincipalTag/Terraformer': '*'
    RoleName: TerraformBackend
    Path: /terraform/
    ManagedPolicyArns:
      - !Ref TerraformStateReadWritePolicy

In the AssumeRolePolicyDocument, I’m specifying who can assume the backend role. There are several ways to go about this, depending on what type of principals are doing the assuming. You could use IAM groups if the principals are IAM users. You could use SAML context keys if dealing with federated users. We will use an attribute-based access control (ABAC) approach to grant access if the principal has a tag of Terraformer. In this case, I’m assuming my principals are IAM users and am restricting access down to that principal type, but this same approach should work for other types as well.
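
For example, with IAM users, attaching the Terraformer tag could look like the following (the user name is hypothetical, and depending on how your account delegates access, the user may also need an identity-based policy allowing sts:AssumeRole on the role):

# Tag an IAM user so it satisfies the aws:PrincipalTag/Terraformer condition in
# the role's trust policy. The tag value is irrelevant; the StringLike '*'
# condition only requires that the tag exists.
aws iam tag-user \
  --user-name some-developer \
  --tags Key=Terraformer,Value=true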

That completes our CloudFormation template. The final result is available here.

Deploy your template as a CloudFormation stack either with the CloudFormation management console or the AWS CLI. Here’s a sample command for the latter:

aws cloudformation deploy \
  --template-file terraform-bootstrap.cf.yml \
  --stack-name terraform-bootstrap \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides \
    StateBucketName=bti360-terraform-state \
    StateLogBucketName=bti360-terraform-state-logs \
    LockTableName=bti360-terraform-state-locks
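
Once the stack deploys, a couple of optional sanity checks confirm everything is in place before we point Terraform at the new role:

# Verify the stack reached CREATE_COMPLETE (or UPDATE_COMPLETE on re-deploys).
aws cloudformation describe-stacks \
  --stack-name terraform-bootstrap \
  --query 'Stacks[0].StackStatus'

# Look up the backend role and note its ARN for the Terragrunt config below.
aws iam get-role --role-name TerraformBackend --query 'Role.Arn'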


Configuring Terragrunt

Next, we need to tell Terraform to use the backend role for remote state access. We previously covered how to add a remote_state block to our Terragrunt config. To use our new role, we add the role_arn property to that remote_state block. Here’s an example:

remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite"
  }
  config = {
    bucket   = "bti360-terraform-state"
    region   = "us-east-1"
    encrypt  = true
    role_arn = "arn:aws:iam::YOUR_ADMIN_ACCOUNT_ID:role/terraform/TerraformBackend"

    key = "${dirname(local.relative_deployment_path)}/${local.stack}.tfstate"

    dynamodb_table            = "bti360-terraform-state-locks"
    accesslogging_bucket_name = "bti360-terraform-state-logs"
  }
}

Re-run terragrunt init for any affected root Terraform modules to switch them over to the new backend configuration, and finally try running terragrunt plan and terragrunt apply. Everything should work.
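
If it doesn’t, a quick way to isolate the problem is to check whether your credentials can assume the backend role at all, outside of Terraform (substitute your account ID as before):

# A successful response returns temporary credentials, which means the trust
# policy and Terraformer tag are wired up correctly. An AccessDenied error
# points at the trust policy, the tag, or your own IAM permissions.
aws sts assume-role \
  --role-arn arn:aws:iam::YOUR_ADMIN_ACCOUNT_ID:role/terraform/TerraformBackend \
  --role-session-name backend-access-check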

Conclusion

We now have a dedicated IAM role for the S3 backend to use. However, we still have to define both administrative and developer-level permissions and lock down access to the Terraform state. Be on the lookout for our next post, which will do just that.

In the meantime, if you care to see a fully worked example that incorporates the RBAC concepts introduced today, check out Chris’ terraform-skeleton series on thirstydeveloper.io.

Footnotes

  1. Versioning our state bucket enables reverting to previous state versions, which limits the damage that write access can do.
