Terraform data external might not act as you expect

2 min read · Dec 27, 2024

Background

I am currently looking to upload a Lambda function to AWS with Terraform.

Terraform triggers a bash build script, which builds the zip file and outputs JSON like the following.

{
  "path": "my_path",
  "zip_file": "/tmp/my-lambda-function.zip"
}

Problem

We use the following Terraform configuration to invoke a custom build script (build.bash) that builds the zip file.

data "external" "build_zip" {
    program = ["bash", "-c", "build.bash"]
}

resource "aws_lambda_function" "test_lambda" {
  ...
  source_code_hash = filesha256(data.external.build_zip.result.zip_file)
  ...
}

Expected

We expected a new version of the Lambda function to be deployed on every run.

Actual

Terraform does not guarantee that the Lambda function is redeployed on every run. The behavior looks random: the function may or may not be updated, and the only way to tell is to check the code in the AWS console.
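Instead of opening the AWS console, one way to see what Terraform picked up on a given run is to expose the script result and the hash of the zip as outputs, then compare the hash between runs. This is only a sketch: the output names are hypothetical, and it assumes the data "external" block from the Problem section.

```hcl
# Hypothetical output names, for inspection only.
output "build_zip_path" {
  # Path reported by build.bash in its JSON result
  value = data.external.build_zip.result.zip_file
}

output "build_zip_sha256" {
  # Changes only if the content of the zip file changed
  value = filesha256(data.external.build_zip.result.zip_file)
}
```

If the hash stays the same across two runs, the zip content did not change, whatever the script did.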

Possible reason

data "external" "build_zip" {  # this might only trigger once
    program = ["bash", "-c", "build.bash"]
    # The block below is what we would need to force a run each time,
    # but data "external" does not support it.
    # triggers = {
    #     always_run = timestamp()
    # }
}

Resolution

resource "null_resource" "build_zip" {
    provisioner "local-exec" {
        command = "build.bash"
        interpreter = ["/bin/bash", "-c"]
    }

    triggers = {
        always_run = timestamp()
    }
}

data "local_file" "zip_file_location" {
  filename = "/tmp/my-lambda-function.zip"
  depends_on = [ null_resource.build_zip ]
}

resource "aws_lambda_function" "test_lambda" {
  ...
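  # filesha256() takes a path string, not a data source object.
  # source_code_hash expects a base64-encoded SHA-256, which recent
  # versions of the hashicorp/local provider expose directly.
  source_code_hash = data.local_file.zip_file_location.content_base64sha256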
  ...
  depends_on = [ data.local_file.zip_file_location ]
}

The code above ensures the build script runs on every apply, because timestamp() gives the triggers map a new value each time.
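The same "always run" pattern can also be written without the null provider. The sketch below is an alternative, not what we deployed: it assumes Terraform 1.4 or later, where the built-in terraform_data resource and its triggers_replace argument are available.

```hcl
# Sketch only: requires Terraform 1.4+ (terraform_data is built in).
resource "terraform_data" "build_zip" {
  # A changed value replaces the resource, which re-runs its provisioners.
  # timestamp() is different on every run, so the build always runs.
  triggers_replace = timestamp()

  provisioner "local-exec" {
    command     = "build.bash"
    interpreter = ["/bin/bash", "-c"]
  }
}
```

With this variant, anything that depended on null_resource.build_zip would point its depends_on at terraform_data.build_zip instead.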

Better resolution (only update when the code changes)

Use a SHA-1 hash of the source directory as the trigger, so the build only runs when a file in that directory changes.

resource "null_resource" "build_zip" {
    provisioner "local-exec" {
        command = "build.bash"
        interpreter = ["/bin/bash", "-c"]
    }

    triggers = {
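        # Hash each file with the same base path that fileset() scans.
        # Note: the zip below is written into /tmp/my-dir, so it is part of this hash too.
        dir_hash = sha1(join("", [for f in fileset("/tmp/my-dir/", "**"): filesha1("/tmp/my-dir/${f}")]))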
    }
}

data "local_file" "zip_file_location" {
  filename = "/tmp/my-dir/my-lambda-function.zip"
  depends_on = [ null_resource.build_zip ]
}

resource "aws_lambda_function" "test_lambda" {
  ...
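  # filesha256() takes a path string, not a data source object.
  # source_code_hash expects a base64-encoded SHA-256, which recent
  # versions of the hashicorp/local provider expose directly.
  source_code_hash = data.local_file.zip_file_location.content_base64sha256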
  ...
  depends_on = [ data.local_file.zip_file_location ]
}
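Because the generated zip lives in the same directory that is being hashed, each build can change the hash and trigger another build on the next run. A possible refinement, sketched below, is to compute the hash in a locals block and skip the zip. The local names (src_dir, src_files, dir_hash) are hypothetical and were not part of our original configuration.

```hcl
locals {
  # Directory watched for changes (same path as in the trigger above)
  src_dir = "/tmp/my-dir"

  # Every file under src_dir except the build output itself
  src_files = [
    for f in fileset(local.src_dir, "**") : f
    if f != "my-lambda-function.zip"
  ]

  # One hash that changes whenever the content of any source file changes
  dir_hash = sha1(join("", [
    for f in local.src_files : filesha1("${local.src_dir}/${f}")
  ]))
}
```

The null_resource trigger would then become dir_hash = local.dir_hash, which keeps the resource block short and leaves the exclusion rule in one place.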

Why

Some people might ask why we don’t use the approach recommended in the official AWS provider documentation, like the following.

# https://registry.terraform.io/providers/hashicorp/aws/4.67.0/docs/resources/lambda_function
data "archive_file" "lambda" {
  type        = "zip"
  source_file = "lambda.js"
  output_path = "lambda_function_payload.zip"
}

My company injects some fields and shared private dependency libraries into the zip file. These are not committed to the Git repo, so the “build.bash” script is the only place where they are fetched.

Written by ukitdog