Terraform data external might not act as you expect

ukitdog
2 min read · Dec 27, 2024


Background

I am currently looking to upload a Lambda function to AWS.

The deployment runs a bash build script, which generates JSON like the following.

{
  "path": "my_path",
  "zip_file": "/tmp/my-lambda-function.zip"
}

Problem

We use the following Terraform configuration to invoke a custom build script (build.bash) that builds the zip file.

data "external" "build_zip" {
  program = ["bash", "-c", "build.bash"]
}

resource "aws_lambda_function" "test_lambda" {
  ...
  source_code_hash = filesha256(data.external.build_zip.result.zip_file)
  ...
}

Expected

We expected a new build of the Lambda function to be deployed every time Terraform runs.

Actual

Terraform does not guarantee that the Lambda function is deployed on every run. The behavior looks random: the function might or might not be updated, and the only way to tell is to check the code in the AWS console.

Possible reason

data "external" "build_zip" {
  # This might only trigger once.
  program = ["bash", "-c", "build.bash"]

  # The external data source does not support a triggers argument,
  # so the following is not accepted:
  # triggers = {
  #   always_run = timestamp() # intended to force a run each time
  # }
}
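For comparison, here is a minimal sketch of the arguments the external data source does accept: program, working_dir, and query. The query key "environment" and the output name are hypothetical and only there for illustration; the script is assumed to sit next to the module.

```hcl
data "external" "build_zip" {
  # Run the script relative to working_dir, so it does not need to be on PATH.
  program     = ["bash", "build.bash"]
  working_dir = path.module

  # Optional: passed to the program as a JSON object on stdin.
  # "environment" is a hypothetical key for illustration.
  query = {
    environment = "dev"
  }
}

# The program must print a JSON object with string values to stdout.
# Terraform exposes that object as the "result" map.
output "zip_file" {
  value = data.external.build_zip.result.zip_file
}
```

None of these arguments controls when the program runs, which is why a triggers-style workaround needs a different resource.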

Resolution

resource "null_resource" "build_zip" {
  provisioner "local-exec" {
    command     = "build.bash"
    interpreter = ["/bin/bash", "-c"]
  }

  # timestamp() changes on every run, so the provisioner always re-runs.
  triggers = {
    always_run = timestamp()
  }
}

data "local_file" "zip_file_location" {
  filename   = "/tmp/my-lambda-function.zip"
  depends_on = [null_resource.build_zip]
}

resource "aws_lambda_function" "test_lambda" {
  ...
  # filesha256 expects a path string, so pass the filename attribute.
  source_code_hash = filesha256(data.local_file.zip_file_location.filename)
  ...
  depends_on = [data.local_file.zip_file_location]
}

The above code ensures that the build script is triggered on every run.
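On Terraform 1.4 or later, the same always-run pattern can probably be written without the null provider by using the built-in terraform_data resource. This is a sketch of an alternative, not what the configuration above uses; it assumes build.bash can be found the same way as before.

```hcl
resource "terraform_data" "build_zip" {
  # timestamp() yields a new value on every run, so the resource is
  # replaced each time and the provisioner runs again.
  triggers_replace = timestamp()

  provisioner "local-exec" {
    command     = "build.bash"
    interpreter = ["/bin/bash", "-c"]
  }
}
```

Anything that depended on null_resource.build_zip would then depend on terraform_data.build_zip instead.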

Better resolution (only update when the code changes)

Use sha1 to watch a directory for changes.

resource "null_resource" "build_zip" {
  provisioner "local-exec" {
    command     = "build.bash"
    interpreter = ["/bin/bash", "-c"]
  }

  # Re-run the build only when the hash of the directory contents changes.
  # The path passed to filesha1 must match the directory given to fileset.
  triggers = {
    dir_hash = sha1(join("", [for f in fileset("/tmp/my-dir/", "**") : filesha1("/tmp/my-dir/${f}")]))
  }
}

data "local_file" "zip_file_location" {
  filename   = "/tmp/my-dir/my-lambda-function.zip"
  depends_on = [null_resource.build_zip]
}

resource "aws_lambda_function" "test_lambda" {
  ...
  # filesha256 expects a path string, so pass the filename attribute.
  source_code_hash = filesha256(data.local_file.zip_file_location.filename)
  ...
  depends_on = [data.local_file.zip_file_location]
}
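One thing to watch with the directory hash: the zip file is written into the same directory that is being hashed, so the build output itself can change the hash on the next run. A minimal sketch of one way around that, assuming the generated archive is the only .zip file in the directory, is to filter it out before hashing. The local names source_dir, source_files, and dir_hash are made up for this example.

```hcl
locals {
  source_dir = "/tmp/my-dir"

  # All files under source_dir except generated zip archives.
  source_files = [
    for f in fileset(local.source_dir, "**") : f
    if length(regexall("\\.zip$", f)) == 0
  ]

  # One hash over the per-file hashes; it changes when any source file changes.
  dir_hash = sha1(join("", [
    for f in local.source_files : filesha1("${local.source_dir}/${f}")
  ]))
}

resource "null_resource" "build_zip" {
  triggers = {
    dir_hash = local.dir_hash
  }

  provisioner "local-exec" {
    command     = "build.bash"
    interpreter = ["/bin/bash", "-c"]
  }
}
```

Keeping the hash in a local also makes it easier to inspect with terraform console when the build runs more or less often than expected.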

Why

Some people might ask why we don't use the approach recommended in the official AWS provider documentation, like the following.

# https://registry.terraform.io/providers/hashicorp/aws/4.67.0/docs/resources/lambda_function
data "archive_file" "lambda" {
  type        = "zip"
  source_file = "lambda.js"
  output_path = "lambda_function_payload.zip"
}

My company injects some fields and shared private dependency libraries into the zip file. These are not committed to the Git repo; they are fetched only by the build.bash script, so the zip has to be built by that script.

References

  1. https://stackoverflow.com/questions/65836439/re-run-the-program-of-an-external-data-source-on-every-plan-refresh-state-te
  2. https://stackoverflow.com/questions/65316834/call-to-function-file-failed-no-file-exists-terraform
  3. https://stackoverflow.com/questions/51138667/can-terraform-watch-a-directory-for-changes
