Terraform Interview Guide: Infrastructure as Code Fundamentals


Terraform has become the de facto standard for Infrastructure as Code. Whether you're managing AWS, Azure, GCP, or Kubernetes, interviewers expect you to understand how Terraform works, not just how to copy examples from the docs.

This guide covers what actually comes up in DevOps and cloud engineering interviews: state management, modules, environments, and the patterns that separate junior from senior engineers.

Terraform Fundamentals

What is Infrastructure as Code?

Infrastructure as Code (IaC) means managing infrastructure through configuration files rather than manual processes. Benefits:

  • Version control: Track changes, review PRs, rollback
  • Reproducibility: Same config = same infrastructure
  • Automation: CI/CD for infrastructure
  • Documentation: Code is the documentation

Terraform vs Other IaC Tools

| Tool | Type | Language | Best For |
|------|------|----------|----------|
| Terraform | Declarative | HCL | Multi-cloud, any provider |
| CloudFormation | Declarative | YAML/JSON | AWS-only shops |
| Pulumi | Declarative | Python/TS/Go | Developers who prefer general-purpose languages |
| Ansible | Procedural | YAML | Configuration management |
| CDK | Declarative | Python/TS | AWS with programming languages |

Example question: "When would you choose Terraform over CloudFormation?"

Terraform when: multi-cloud, need consistent tooling across providers, want to manage non-AWS resources (Kubernetes, Datadog, GitHub), or prefer HCL syntax and ecosystem.

CloudFormation when: AWS-only, need tight AWS integration (StackSets, drift detection), want native AWS support, or organization standardized on it.

Core Workflow

# 1. Initialize - download providers, set up backend
terraform init
 
# 2. Plan - preview changes without applying
terraform plan
 
# 3. Apply - create/update infrastructure
terraform apply
 
# 4. Destroy - tear down infrastructure
terraform destroy

What happens during init:

  • Downloads provider plugins
  • Initializes the backend (local or remote)
  • Downloads referenced modules
  • Creates the .terraform directory and the .terraform.lock.hcl dependency lock file

Providers and Resources

Provider: Plugin that knows how to talk to an API (AWS, Azure, Kubernetes)

# Configure the AWS provider
provider "aws" {
  region = "us-east-1"
}
 
# Create a resource
resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.micro"
 
  tags = {
    Name = "web-server"
  }
}

Resource anatomy:

  • aws_instance - resource type (provider_resource)
  • "web" - local name (for referencing)
  • Block contents - configuration arguments

Referencing resources:

# Reference another resource's attribute
resource "aws_eip" "web_ip" {
  instance = aws_instance.web.id  # type.name.attribute
}

HCL Language Basics

Variables

Input variables: Parameters for your configuration

# variables.tf
variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.micro"
}
 
variable "environment" {
  description = "Environment name"
  type        = string
  # No default = required variable
}
 
variable "allowed_ports" {
  description = "List of allowed ports"
  type        = list(number)
  default     = [80, 443]
}
 
variable "tags" {
  description = "Resource tags"
  type        = map(string)
  default     = {}
}

Setting variables:

# Command line
terraform apply -var="environment=prod"
 
# Variable file
terraform apply -var-file="prod.tfvars"
 
# Environment variable
export TF_VAR_environment=prod
 
# Auto-loaded files: terraform.tfvars, *.auto.tfvars
# Precedence (low to high): env vars, terraform.tfvars, *.auto.tfvars,
# then -var / -var-file flags in the order given

Output variables: Export values for other configs or users

# outputs.tf
output "instance_ip" {
  description = "Public IP of the instance"
  value       = aws_instance.web.public_ip
}
 
output "database_password" {
  description = "Database password"
  value       = random_password.db.result
  sensitive   = true  # Redacted in CLI output (still stored in plain text in state)
}

Local variables: Computed values for reuse within a module

locals {
  common_tags = {
    Environment = var.environment
    ManagedBy   = "terraform"
    Project     = var.project_name
  }
 
  name_prefix = "${var.project_name}-${var.environment}"
}
 
resource "aws_instance" "web" {
  # ...
  tags = merge(local.common_tags, {
    Name = "${local.name_prefix}-web"
  })
}

Data Types

# Primitives
string  = "hello"
number  = 42
bool    = true
 
# Collections
list    = ["a", "b", "c"]           # Ordered, same type
set     = toset(["a", "b", "c"])    # Unordered, unique
map     = { key = "value" }          # Key-value pairs
 
# Structural
object({
  name = string
  age  = number
})
 
tuple([string, number, bool])

Conditionals

# Ternary expression
resource "aws_instance" "web" {
  instance_type = var.environment == "prod" ? "t3.large" : "t3.micro"
}
 
# Conditional resource creation
resource "aws_eip" "web" {
  count    = var.create_eip ? 1 : 0
  instance = aws_instance.web.id
}

Loops: count vs for_each

count: Create multiple resources by index

resource "aws_instance" "web" {
  count         = 3
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.micro"
 
  tags = {
    Name = "web-${count.index}"  # web-0, web-1, web-2
  }
}
 
# Reference: aws_instance.web[0], aws_instance.web[1]

for_each: Create resources by key

variable "instances" {
  default = {
    web    = "t3.micro"
    api    = "t3.small"
    worker = "t3.medium"
  }
}
 
resource "aws_instance" "server" {
  for_each      = var.instances
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = each.value
 
  tags = {
    Name = each.key
  }
}
 
# Reference: aws_instance.server["web"], aws_instance.server["api"]

When to use which:

| Use Case | Recommendation |
|----------|----------------|
| N identical resources | count |
| Resources with unique identity | for_each |
| Might remove items from the middle | for_each |
| List of objects | for_each over a map built with a for expression |
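The last row deserves a sketch: for_each accepts only a map or a set of strings, so a list of objects is typically converted with a for expression keyed by a stable attribute. The variable and resource names below are illustrative:

```hcl
variable "servers" {
  description = "Illustrative list of server definitions"
  type = list(object({
    name = string
    size = string
  }))
  default = [
    { name = "web", size = "t3.micro" },
    { name = "api", size = "t3.small" },
  ]
}

resource "aws_instance" "this" {
  # Key each instance by name so removing one entry never shifts the others
  for_each      = { for s in var.servers : s.name => s }
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = each.value.size

  tags = {
    Name = each.key
  }
}
```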

The count index problem:

# With count = ["a", "b", "c"]
# Removing "b" causes "c" to shift from index 2 to 1
# Terraform sees: destroy old [2], modify [1]
# Result: Unintended recreation
 
# With for_each = toset(["a", "b", "c"])
# Removing "b" only affects resource["b"]
# Resources "a" and "c" unchanged

Data Sources

Query existing infrastructure (read-only):

# Get latest Amazon Linux AMI
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]
 
  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}
 
# Use it
resource "aws_instance" "web" {
  ami = data.aws_ami.amazon_linux.id
  # ...
}
 
# Get current AWS account ID
data "aws_caller_identity" "current" {}
 
output "account_id" {
  value = data.aws_caller_identity.current.account_id
}

State Management

What is State?

State is a JSON file mapping configuration to real resources:

{
  "resources": [
    {
      "type": "aws_instance",
      "name": "web",
      "instances": [
        {
          "attributes": {
            "id": "i-1234567890abcdef0",
            "ami": "ami-0c55b159cbfafe1f0",
            "public_ip": "54.123.45.67"
          }
        }
      ]
    }
  ]
}

Why state matters:

  • Maps config to real resource IDs
  • Tracks dependencies for ordering
  • Caches attributes to reduce API calls
  • Detects drift from desired state

Remote State Backends

Never use local state in teams. Remote backends provide:

  • Shared access for team members
  • State locking to prevent conflicts
  • Encryption at rest
  • Versioning for recovery

S3 Backend (AWS):

terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/network/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"  # For locking
  }
}
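The state bucket itself must exist before this backend can be used; it can't be managed by the state it stores, so it is usually created by a separate bootstrap configuration. A minimal sketch with versioning and encryption enabled (the bucket name is a placeholder):

```hcl
resource "aws_s3_bucket" "tf_state" {
  bucket = "my-terraform-state" # placeholder; bucket names must be globally unique
}

resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  versioning_configuration {
    status = "Enabled" # lets you recover earlier state versions
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}
```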

GCS Backend (GCP):

terraform {
  backend "gcs" {
    bucket = "my-terraform-state"
    prefix = "prod/network"
  }
}

Terraform Cloud:

terraform {
  cloud {
    organization = "my-org"
    workspaces {
      name = "prod-network"
    }
  }
}

State Locking

Prevents concurrent modifications:

Developer A                    Developer B
     |                              |
     |-- terraform apply -------->  |
     |   (acquires lock)            |
     |                              |-- terraform apply
     |                              |   (BLOCKED - lock held)
     |   (releases lock) -------->  |
     |                              |   (acquires lock, proceeds)

DynamoDB table for S3 backend locking:

resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"
 
  attribute {
    name = "LockID"
    type = "S"
  }
}

State Commands

# List resources in state
terraform state list
 
# Show specific resource
terraform state show aws_instance.web
 
# Move resource (rename)
terraform state mv aws_instance.web aws_instance.app
 
# Remove from state (doesn't destroy resource)
terraform state rm aws_instance.web
 
# Import existing resource into state
terraform import aws_instance.web i-1234567890abcdef0
 
# Force unlock (dangerous - use if lock is stuck)
terraform force-unlock LOCK_ID
 
# Pull remote state locally
terraform state pull > state.json
 
# Push local state to remote (dangerous)
terraform state push state.json

Import workflow:

# 1. Write the resource configuration
resource "aws_instance" "existing" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.micro"
}
 
# 2. Import the existing resource
# terraform import aws_instance.existing i-1234567890abcdef0
 
# 3. Run plan to verify configuration matches
# terraform plan
# Adjust config until no changes shown
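Terraform 1.5+ also supports declarative import blocks, which make imports reviewable in the plan instead of a one-off CLI step:

```hcl
import {
  to = aws_instance.existing
  id = "i-1234567890abcdef0"
}
```

Running terraform plan -generate-config-out=generated.tf will even write a starting configuration for the imported resource, which you can then clean up by hand.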

Modules

Why Use Modules?

  • Reusability: Write once, use many times
  • Encapsulation: Hide complexity behind simple interface
  • Consistency: Enforce standards across teams
  • Versioning: Control updates and changes

Module Structure

modules/
└── vpc/
    ├── main.tf          # Resources
    ├── variables.tf     # Input variables
    ├── outputs.tf       # Output values
    ├── versions.tf      # Provider requirements
    └── README.md        # Documentation

Example module:

# modules/vpc/variables.tf
variable "name" {
  description = "VPC name"
  type        = string
}
 
variable "cidr" {
  description = "VPC CIDR block"
  type        = string
  default     = "10.0.0.0/16"
}
 
variable "azs" {
  description = "Availability zones"
  type        = list(string)
}
 
# modules/vpc/main.tf
resource "aws_vpc" "this" {
  cidr_block           = var.cidr
  enable_dns_hostnames = true
  enable_dns_support   = true
 
  tags = {
    Name = var.name
  }
}
 
resource "aws_subnet" "public" {
  count             = length(var.azs)
  vpc_id            = aws_vpc.this.id
  cidr_block        = cidrsubnet(var.cidr, 8, count.index)
  availability_zone = var.azs[count.index]
 
  tags = {
    Name = "${var.name}-public-${var.azs[count.index]}"
  }
}
 
# modules/vpc/outputs.tf
output "vpc_id" {
  description = "VPC ID"
  value       = aws_vpc.this.id
}
 
output "public_subnet_ids" {
  description = "Public subnet IDs"
  value       = aws_subnet.public[*].id
}

Using the module:

module "vpc" {
  source = "./modules/vpc"
 
  name = "production"
  cidr = "10.0.0.0/16"
  azs  = ["us-east-1a", "us-east-1b", "us-east-1c"]
}
 
# Reference outputs
resource "aws_instance" "web" {
  subnet_id = module.vpc.public_subnet_ids[0]
  # ...
}

Module Sources

# Local path
module "vpc" {
  source = "./modules/vpc"
}
 
# Terraform Registry
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.0.0"
}
 
# GitHub
module "vpc" {
  source = "github.com/org/repo//modules/vpc?ref=v1.0.0"
}
 
# S3 bucket
module "vpc" {
  source = "s3::https://s3-eu-west-1.amazonaws.com/bucket/vpc.zip"
}

Module Versioning

Always pin versions in production:

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.0.0"  # Exact version
 
  # Or version constraints
  # version = "~> 5.0"   # >= 5.0.0, < 6.0.0
  # version = ">= 5.0"   # >= 5.0.0
}

Workspaces & Environments

Terraform Workspaces

Workspaces let one configuration maintain multiple state files, one per workspace:

# List workspaces
terraform workspace list
 
# Create workspace
terraform workspace new staging
 
# Switch workspace
terraform workspace select production
 
# Show current
terraform workspace show
 
# Delete workspace
terraform workspace delete staging

Using workspace in config:

resource "aws_instance" "web" {
  instance_type = terraform.workspace == "prod" ? "t3.large" : "t3.micro"
 
  tags = {
    Environment = terraform.workspace
  }
}

Directory Structure Pattern

For stronger isolation, use separate directories:

terraform/
├── modules/
│   └── app/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── terraform.tfvars
│   │   └── backend.tf
│   ├── staging/
│   │   └── ...
│   └── prod/
│       └── ...

Each environment has its own:

  • State file (different backend key)
  • Variable values
  • Provider configuration if needed
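The per-environment backend files typically differ only in the key (and sometimes the bucket), which is what keeps each environment's state separate. A sketch with illustrative names:

```hcl
# environments/dev/backend.tf
terraform {
  backend "s3" {
    bucket = "my-terraform-state"
    key    = "dev/app/terraform.tfstate"
    region = "us-east-1"
  }
}

# environments/prod/backend.tf
terraform {
  backend "s3" {
    bucket = "my-terraform-state"
    key    = "prod/app/terraform.tfstate" # different key = different state file
    region = "us-east-1"
  }
}
```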

Workspaces vs Directories

| Aspect | Workspaces | Directories |
|--------|------------|-------------|
| State isolation | Same backend, different keys | Completely separate |
| Code duplication | None | Some (can use modules) |
| Variable differences | Conditional logic | Separate tfvars |
| Accidental cross-apply | Possible (wrong workspace) | Harder (different directory) |
| Best for | Similar environments | Very different environments |

Recommendation: Use directories for prod vs non-prod, workspaces for similar environments (dev1, dev2).


Best Practices & Patterns

Code Organization

project/
├── main.tf           # Primary resources
├── variables.tf      # All variable declarations
├── outputs.tf        # All outputs
├── versions.tf       # Terraform and provider versions
├── providers.tf      # Provider configurations
├── locals.tf         # Local values
├── data.tf           # Data sources
└── terraform.tfvars  # Variable values (don't commit secrets)

versions.tf:

terraform {
  required_version = ">= 1.5.0"
 
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

Naming Conventions

# Resources: descriptive, lowercase, underscores
resource "aws_instance" "web_server" { }
resource "aws_security_group" "web_sg" { }
 
# Variables: lowercase, underscores
variable "instance_type" { }
variable "environment_name" { }
 
# Outputs: lowercase, underscores, descriptive
output "load_balancer_dns" { }
 
# Locals: lowercase, underscores
locals {
  common_tags = { }
}

Secrets Management

Never do this:

# BAD - secrets in code
resource "aws_db_instance" "db" {
  password = "supersecret123"  # NO!
}

Better approaches:

# 1. Variable with no default (prompt or tfvars)
variable "db_password" {
  type      = string
  sensitive = true
}
 
# 2. Environment variable
# export TF_VAR_db_password=xxx
 
# 3. External secret store
data "aws_secretsmanager_secret_version" "db" {
  secret_id = "prod/db/password"
}
 
resource "aws_db_instance" "db" {
  password = data.aws_secretsmanager_secret_version.db.secret_string
}
 
# 4. Generate and store
resource "random_password" "db" {
  length  = 32
  special = true
}
 
resource "aws_secretsmanager_secret_version" "db" {
  secret_id     = aws_secretsmanager_secret.db.id
  secret_string = random_password.db.result
}

CI/CD Integration

GitHub Actions example:

name: Terraform
 
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
 
jobs:
  terraform:
    runs-on: ubuntu-latest
 
    steps:
      - uses: actions/checkout@v4
 
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.6.0
 
      - name: Terraform Init
        run: terraform init
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
 
      - name: Terraform Format Check
        run: terraform fmt -check
 
      - name: Terraform Plan
        run: terraform plan -no-color
        if: github.event_name == 'pull_request'
 
      - name: Terraform Apply
        run: terraform apply -auto-approve
        if: github.ref == 'refs/heads/main' && github.event_name == 'push'

Best practices for CI/CD:

  • Run terraform fmt -check to enforce formatting
  • Run terraform validate for syntax errors
  • Always run plan on PRs
  • Require approval before apply to production
  • Use OIDC for cloud authentication (no long-lived keys)
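The OIDC point is worth showing concretely: with GitHub's OIDC provider trusted by an AWS IAM role, the workflow assumes the role at runtime and the long-lived secrets.* access keys in the example above become unnecessary. A sketch using the aws-actions/configure-aws-credentials action (the role ARN is a placeholder):

```yaml
permissions:
  id-token: write   # required for the OIDC token exchange
  contents: read

steps:
  - uses: aws-actions/configure-aws-credentials@v4
    with:
      role-to-assume: arn:aws:iam::123456789012:role/terraform-ci  # placeholder ARN
      aws-region: us-east-1
```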

Common Interview Questions

Scenario: State Drift

Question: "Someone manually modified a resource. How do you handle it?"

# 1. Detect drift
terraform plan
# Shows: aws_instance.web will be updated (instance_type changed)
 
# 2. Decision point:
# Keep manual change → update your config to match
# Revert manual change → apply to correct drift
 
# 3. If keeping, update config:
resource "aws_instance" "web" {
  instance_type = "t3.large"  # Match manual change
}
 
# 4. Verify no changes
terraform plan
# No changes. Your infrastructure matches the configuration.

Prevention:

  • Lock down console access
  • Use CI/CD for all changes
  • Enable drift detection alerts
  • Regular terraform plan in CI
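The "regular terraform plan in CI" idea pairs well with plan's -detailed-exitcode flag, which exits 0 when there are no changes, 1 on error, and 2 when changes (i.e. drift) are pending, so a scheduled job fails exactly when drift exists. A sketch as a GitHub Actions cron job:

```yaml
on:
  schedule:
    - cron: "0 6 * * *"   # daily drift check

jobs:
  drift:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init
      - name: Detect drift
        run: terraform plan -detailed-exitcode -no-color
        # exit code 2 (pending changes) fails the job, surfacing drift
```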

Scenario: Partial Apply Failure

Question: "Terraform apply failed halfway. What now?"

# 1. Check what was created
terraform state list
 
# 2. Check current state vs desired
terraform plan
 
# 3. Options:
# - Fix the error and re-run apply
# - If a created resource is broken, force recreation:
terraform apply -replace=aws_instance.web
# (terraform taint is the older, now-deprecated way to do this)
 
# 4. If state is corrupted:
# - Restore from state backup (S3 versioning)
# - Or manually fix with state commands

Scenario: Resource Rename

Question: "How do you rename a resource without destroying it?"

# Before
resource "aws_instance" "web" { }
 
# After
resource "aws_instance" "application" { }
# 1. Move in state
terraform state mv aws_instance.web aws_instance.application
 
# 2. Update code to use new name
 
# 3. Verify no changes
terraform plan
# No changes.
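Terraform 1.1+ offers a declarative alternative to state mv: a moved block. Because the rename lives in code, it is visible in the plan and every collaborator's state is migrated on their next apply:

```hcl
moved {
  from = aws_instance.web
  to   = aws_instance.application
}
```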

Scenario: Migrate Local to Remote State

# 1. Add backend configuration
terraform {
  backend "s3" {
    bucket = "my-terraform-state"
    key    = "prod/terraform.tfstate"
    region = "us-east-1"
  }
}
 
# 2. Initialize with migration
terraform init -migrate-state
 
# Terraform will prompt to copy existing state to new backend

Quick Reference

Essential Commands

| Command | Purpose |
|---------|---------|
| terraform init | Initialize working directory |
| terraform plan | Preview changes |
| terraform apply | Apply changes |
| terraform destroy | Destroy infrastructure |
| terraform fmt | Format code |
| terraform validate | Validate syntax |
| terraform state list | List resources in state |
| terraform import | Import existing resource |
| terraform output | Show outputs |

Common Patterns

# Conditional resource
count = var.create_resource ? 1 : 0
 
# Conditional attribute
instance_type = var.env == "prod" ? "t3.large" : "t3.micro"
 
# Dynamic blocks
dynamic "ingress" {
  for_each = var.ports
  content {
    from_port = ingress.value
    to_port   = ingress.value
    protocol  = "tcp"
  }
}
 
# Depends on (explicit dependency)
depends_on = [aws_iam_role_policy.example]
 
# Lifecycle rules
lifecycle {
  create_before_destroy = true
  prevent_destroy       = true
  ignore_changes        = [tags]
}


Final Thoughts

Terraform interviews test understanding of state management, module design, and operational patterns. Key areas:

  1. State is everything: Understand remote backends, locking, drift
  2. Modules for reuse: Know when and how to create them
  3. Environment management: Workspaces vs directories trade-offs
  4. Security: Never commit secrets, use external stores
  5. CI/CD: Automated plan/apply workflows

Practice by building real infrastructure. Break things, fix state issues, import existing resources. That hands-on experience shows in interviews.
