Terraform has become the standard for Infrastructure as Code. Whether you're managing AWS, Azure, GCP, or Kubernetes, interviewers expect you to understand how Terraform works—not just how to copy examples from the docs.
This guide covers what actually comes up in DevOps and cloud engineering interviews: state management, modules, environments, and the patterns that separate junior from senior engineers.
Table of Contents
- Terraform Fundamentals Questions
- HCL Language Questions
- State Management Questions
- Module Questions
- Workspace and Environment Questions
- Best Practices Questions
- Troubleshooting and Scenario Questions
Terraform Fundamentals Questions
Understanding Terraform's core concepts is essential for any DevOps interview.
What is Infrastructure as Code and why does it matter?
Infrastructure as Code (IaC) means managing infrastructure through configuration files rather than manual processes. Instead of clicking through cloud consoles or running ad-hoc commands, you define your entire infrastructure in version-controlled files that can be reviewed, tested, and applied consistently.
This approach transforms infrastructure management by bringing software engineering practices to operations. Your infrastructure becomes reproducible—the same configuration always produces the same result. Changes are tracked in version control, enabling code reviews and easy rollbacks. Automation becomes straightforward since you can integrate infrastructure changes into CI/CD pipelines.
Key benefits:
- Version control: Track changes, review PRs, rollback
- Reproducibility: Same config = same infrastructure
- Automation: CI/CD for infrastructure
- Documentation: Code is the documentation
How does Terraform compare to other IaC tools?
Each IaC tool has its strengths and ideal use cases. Terraform's main advantage is its provider ecosystem that works across any cloud or service with an API. CloudFormation is AWS-native and tightly integrated but locks you into one cloud. Ansible is procedural rather than declarative, making it better suited for configuration management than infrastructure provisioning.
The choice often depends on your organization's needs. Multi-cloud or hybrid environments benefit from Terraform's consistency. AWS-only shops might prefer CloudFormation's native integration. Teams with strong programming backgrounds might choose Pulumi or CDK for their familiar language syntax.
| Tool | Type | Language | Best For |
|---|---|---|---|
| Terraform | Declarative | HCL | Multi-cloud, any provider |
| CloudFormation | Declarative | YAML/JSON | AWS-only shops |
| Pulumi | Declarative | Python/TS/Go | Developers who prefer real languages |
| Ansible | Procedural | YAML | Configuration management |
| CDK | Declarative | Python/TS | AWS with programming languages |
When would you choose Terraform over CloudFormation?
This common interview question tests your understanding of tool selection based on requirements. Terraform excels when you need to work across multiple cloud providers or manage non-cloud resources like GitHub repositories, Datadog monitors, or Kubernetes clusters. Its provider ecosystem covers virtually any API-driven service.
Choose Terraform when you need multi-cloud support, consistent tooling across providers, or management of non-AWS resources. Choose CloudFormation when you're AWS-only, need tight AWS integration like StackSets and native drift detection, or when your organization has standardized on it.
What is the core Terraform workflow?
Terraform follows a simple but powerful workflow: initialize, plan, apply. Understanding this workflow and what happens at each stage demonstrates operational competency to interviewers.
The init phase downloads provider plugins and modules, sets up the backend, and prepares the working directory. Plan compares your configuration to the current state and shows what changes would be made without actually making them. Apply executes those changes, updating infrastructure to match your configuration.
# 1. Initialize - download providers, set up backend
terraform init
# 2. Plan - preview changes without applying
terraform plan
# 3. Apply - create/update infrastructure
terraform apply
# 4. Destroy - tear down infrastructure
terraform destroy

What happens during init:
- Downloads provider plugins
- Initializes backend (local or remote)
- Downloads modules
- Creates the `.terraform` directory
How do providers and resources work in Terraform?
Providers are plugins that know how to interact with specific APIs—AWS, Azure, Kubernetes, or any service with an API. Resources are the actual infrastructure components you want to manage, defined using the provider's resource types.
Each resource has a type (combining provider and resource kind) and a local name you use to reference it elsewhere in your configuration. The block contents specify the configuration arguments for that resource. Understanding this anatomy helps you read and write Terraform configurations fluently.
# Configure the AWS provider
provider "aws" {
region = "us-east-1"
}
# Create a resource
resource "aws_instance" "web" {
ami = "ami-0c55b159cbfafe1f0"
instance_type = "t3.micro"
tags = {
Name = "web-server"
}
}

Resource anatomy:
- `aws_instance` - resource type (provider_resource)
- `"web"` - local name (for referencing)
- Block contents - configuration arguments
Referencing resources:
# Reference another resource's attribute
resource "aws_eip" "web_ip" {
instance = aws_instance.web.id # type.name.attribute
}

HCL Language Questions
HCL (HashiCorp Configuration Language) is Terraform's domain-specific language for defining infrastructure.
How do you use variables in Terraform?
Variables make your Terraform configurations reusable and flexible. Input variables act as parameters—you define them in your configuration and provide values at runtime. Output variables export values for use by other configurations or for human consumption. Local variables are computed values for reuse within a module.
Understanding the different variable types and how to set them is fundamental Terraform knowledge. Variables can have defaults, validation rules, and type constraints that catch errors early.
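The validation rules mentioned above can be sketched like this (the variable name and allowed values are illustrative):

```hcl
variable "environment" {
  description = "Environment name"
  type        = string

  validation {
    # Reject anything outside the known environments at plan time
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "environment must be one of: dev, staging, prod."
  }
}
```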
Input variables: Parameters for your configuration
# variables.tf
variable "instance_type" {
description = "EC2 instance type"
type = string
default = "t3.micro"
}
variable "environment" {
description = "Environment name"
type = string
# No default = required variable
}
variable "allowed_ports" {
description = "List of allowed ports"
type = list(number)
default = [80, 443]
}
variable "tags" {
description = "Resource tags"
type = map(string)
default = {}
}

Setting variables:
# Command line
terraform apply -var="environment=prod"
# Variable file
terraform apply -var-file="prod.tfvars"
# Environment variable
export TF_VAR_environment=prod
# Auto-loaded files: terraform.tfvars, *.auto.tfvars

Output variables: Export values for other configs or users
# outputs.tf
output "instance_ip" {
description = "Public IP of the instance"
value = aws_instance.web.public_ip
}
output "database_password" {
description = "Database password"
value = random_password.db.result
sensitive = true # Won't show in logs
}

Local variables: Computed values for reuse within a module
locals {
common_tags = {
Environment = var.environment
ManagedBy = "terraform"
Project = var.project_name
}
name_prefix = "${var.project_name}-${var.environment}"
}
resource "aws_instance" "web" {
# ...
tags = merge(local.common_tags, {
Name = "${local.name_prefix}-web"
})
}

What data types does Terraform support?
Terraform supports primitive types (string, number, bool) and collection types (list, set, map) as well as structural types (object, tuple). Understanding these types helps you write type-safe configurations and catch errors during planning rather than at apply time.
Lists maintain order and allow duplicates. Sets are unordered and unique. Maps store key-value pairs. Objects combine named attributes with different types, while tuples are ordered collections with mixed types.
# Primitives
string = "hello"
number = 42
bool = true
# Collections
list = ["a", "b", "c"] # Ordered, same type
set = toset(["a", "b", "c"]) # Unordered, unique
map = { key = "value" } # Key-value pairs
# Structural
object({
name = string
age = number
})
tuple([string, number, bool])

How do you write conditional expressions in Terraform?
Conditional expressions let you make decisions in your configuration based on variable values or other conditions. Terraform uses the ternary syntax common in many programming languages: condition ? true_value : false_value.
You can use conditionals for attribute values or combined with count to conditionally create entire resources. This pattern is essential for writing flexible modules that adapt to different environments or requirements.
# Ternary expression
resource "aws_instance" "web" {
instance_type = var.environment == "prod" ? "t3.large" : "t3.micro"
}
# Conditional resource creation
resource "aws_eip" "web" {
count = var.create_eip ? 1 : 0
instance = aws_instance.web.id
}

What is the difference between count and for_each?
This is one of the most common Terraform interview questions because it reveals understanding of resource addressing and state management. Count creates resources indexed by number, while for_each creates resources indexed by key. This seemingly small difference has major implications for how Terraform handles changes.
Count works well for creating N identical resources, but causes problems when you remove items from the middle of a list—all subsequent indices shift, causing Terraform to destroy and recreate resources. For_each avoids this by keying resources by name, so removing one item only affects that specific resource.
count: Create multiple resources by index
resource "aws_instance" "web" {
count = 3
ami = "ami-0c55b159cbfafe1f0"
instance_type = "t3.micro"
tags = {
Name = "web-${count.index}" # web-0, web-1, web-2
}
}
# Reference: aws_instance.web[0], aws_instance.web[1]

for_each: Create resources by key
variable "instances" {
default = {
web = "t3.micro"
api = "t3.small"
worker = "t3.medium"
}
}
resource "aws_instance" "server" {
for_each = var.instances
ami = "ami-0c55b159cbfafe1f0"
instance_type = each.value
tags = {
Name = each.key
}
}
# Reference: aws_instance.server["web"], aws_instance.server["api"]

When to use which:
| Use Case | Recommendation |
|---|---|
| N identical resources | count |
| Resources with unique identity | for_each |
| Might remove items from middle | for_each |
| List of objects | for_each with toset() or tomap() |
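The last table row can be illustrated with a sketch that converts a list to a set so for_each can key each resource by its value (the variable and bucket names are hypothetical):

```hcl
variable "bucket_suffixes" {
  type    = list(string)
  default = ["logs", "assets", "backups"]
}

resource "aws_s3_bucket" "this" {
  # toset() turns the list into a set; for sets, each.key equals each.value
  for_each = toset(var.bucket_suffixes)
  bucket   = "myproject-${each.key}"
}
```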
The count index problem:
# With count = ["a", "b", "c"]
# Removing "b" causes "c" to shift from index 2 to 1
# Terraform sees: destroy old [2], modify [1]
# Result: Unintended recreation
# With for_each = toset(["a", "b", "c"])
# Removing "b" only affects resource["b"]
# Resources "a" and "c" unchangedHow do data sources work in Terraform?
Data sources let you query existing infrastructure or external information to use in your configuration. Unlike resources which create and manage infrastructure, data sources are read-only—they fetch information that already exists.
Common uses include looking up the latest AMI, getting the current AWS account ID, or reading information about existing infrastructure that Terraform doesn't manage. This is essential for integrating Terraform with manually-created resources or resources managed by other teams.
# Get latest Amazon Linux AMI
data "aws_ami" "amazon_linux" {
most_recent = true
owners = ["amazon"]
filter {
name = "name"
values = ["amzn2-ami-hvm-*-x86_64-gp2"]
}
}
# Use it
resource "aws_instance" "web" {
ami = data.aws_ami.amazon_linux.id
# ...
}
# Get current AWS account ID
data "aws_caller_identity" "current" {}
output "account_id" {
value = data.aws_caller_identity.current.account_id
}

State Management Questions
State management is arguably the most critical aspect of Terraform operations.
What is Terraform state and why is it important?
State is a JSON file that maintains the mapping between your configuration and real infrastructure. Without state, Terraform couldn't know which real resources correspond to which configuration blocks, what order to create or update resources, or what the current values of resource attributes are.
The state file contains resource IDs that let Terraform interact with the cloud provider API, dependency information for determining operation order, and cached attribute values that reduce API calls. Understanding state deeply is essential for troubleshooting and disaster recovery.
{
"resources": [
{
"type": "aws_instance",
"name": "web",
"instances": [
{
"attributes": {
"id": "i-1234567890abcdef0",
"ami": "ami-0c55b159cbfafe1f0",
"public_ip": "54.123.45.67"
}
}
]
}
]
}

Why state matters:
- Maps config to real resource IDs
- Tracks dependencies for ordering
- Caches attributes to reduce API calls
- Detects drift from desired state
Why should you use remote state backends?
Local state files work for individual learning but cause serious problems in team environments. Without remote state, team members can't collaborate—they don't have access to each other's state files. Concurrent applies can corrupt state or cause duplicate resources. There's no locking to prevent conflicts and no backup if the local file is lost.
Remote backends solve all these problems by storing state in a shared location with locking to prevent concurrent modifications. They also provide encryption at rest, versioning for recovery, and access control.
Never use local state in teams. Remote backends provide:
- Shared access for team members
- State locking to prevent conflicts
- Encryption at rest
- Versioning for recovery
S3 Backend (AWS):
terraform {
backend "s3" {
bucket = "my-terraform-state"
key = "prod/network/terraform.tfstate"
region = "us-east-1"
encrypt = true
dynamodb_table = "terraform-locks" # For locking
}
}

GCS Backend (GCP):
terraform {
backend "gcs" {
bucket = "my-terraform-state"
prefix = "prod/network"
}
}

Terraform Cloud:
terraform {
cloud {
organization = "my-org"
workspaces {
name = "prod-network"
}
}
}

How does state locking work?
State locking prevents multiple users or processes from modifying state simultaneously, which could corrupt it or cause duplicate resources. When you run a Terraform operation that modifies state, Terraform first acquires a lock. If someone else has the lock, you wait until it's released.
For S3 backends, locking uses DynamoDB. Other backends have their own locking mechanisms. Understanding locking helps you troubleshoot situations where Terraform says the state is locked and you need to determine if another apply is running or if a lock is stuck.
sequenceDiagram
participant A as Developer A
participant S as State Backend
participant B as Developer B
A->>S: terraform apply
S-->>A: Lock acquired ✓
Note over A,S: Lock held by A
B->>S: terraform apply
S--xB: BLOCKED (lock held)
Note over B: Waiting for lock...
A->>S: Apply complete, release lock
S-->>A: Lock released
B->>S: Retry - acquire lock
S-->>B: Lock acquired ✓
Note over B,S: B proceeds with apply

DynamoDB table for S3 backend locking:
resource "aws_dynamodb_table" "terraform_locks" {
name = "terraform-locks"
billing_mode = "PAY_PER_REQUEST"
hash_key = "LockID"
attribute {
name = "LockID"
type = "S"
}
}

What state commands should you know for interviews?
Terraform provides several commands for inspecting and manipulating state. These are essential for troubleshooting, refactoring, and disaster recovery. Interviewers often ask about specific commands and when you'd use them.
The most commonly used commands are state list and state show for inspection, state mv for refactoring, state rm for removing resources from management without destroying them, and import for bringing existing infrastructure under Terraform control.
# List resources in state
terraform state list
# Show specific resource
terraform state show aws_instance.web
# Move resource (rename)
terraform state mv aws_instance.web aws_instance.app
# Remove from state (doesn't destroy resource)
terraform state rm aws_instance.web
# Import existing resource into state
terraform import aws_instance.web i-1234567890abcdef0
# Force unlock (dangerous - use if lock is stuck)
terraform force-unlock LOCK_ID
# Pull remote state locally
terraform state pull > state.json
# Push local state to remote (dangerous)
terraform state push state.json

How do you import existing resources into Terraform?
Importing lets you bring manually-created resources under Terraform management without destroying and recreating them. This is essential when adopting Terraform for existing infrastructure or when resources were created outside your Terraform workflow.
The import process requires writing the resource configuration first, then running the import command with the resource address and real-world identifier. Finally, you run plan to verify your configuration matches the actual resource and adjust until there are no changes.
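Terraform 1.5 also added a declarative alternative: an import block in configuration that plan/apply executes. The resource address and ID below are illustrative:

```hcl
import {
  to = aws_instance.existing
  id = "i-1234567890abcdef0"
}

# Terraform 1.5+ can also generate matching configuration for you:
# terraform plan -generate-config-out=generated.tf
```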
# 1. Write the resource configuration
resource "aws_instance" "existing" {
ami = "ami-0c55b159cbfafe1f0"
instance_type = "t3.micro"
}
# 2. Import the existing resource
# terraform import aws_instance.existing i-1234567890abcdef0
# 3. Run plan to verify configuration matches
# terraform plan
# Adjust config until no changes shown

Module Questions
Modules are the primary way to organize and reuse Terraform code.
Why should you use Terraform modules?
Modules package related resources together into reusable, shareable components. Instead of copying the same VPC configuration into every project, you define it once as a module and call it with different parameters. This reduces duplication, enforces consistency, and makes large configurations manageable.
Good modules hide implementation complexity behind a simple interface. Users don't need to understand all the resources involved—they just provide the required inputs and consume the outputs. This encapsulation also makes it easier to update implementations without affecting every project that uses the module.
Key benefits:
- Reusability: Write once, use many times
- Encapsulation: Hide complexity behind simple interface
- Consistency: Enforce standards across teams
- Versioning: Control updates and changes
How should you structure a Terraform module?
Module structure follows conventions that make modules predictable and easy to use. Every module needs at least main.tf for resources, variables.tf for inputs, and outputs.tf for values other configurations can use. Additional files like versions.tf for provider requirements and README.md for documentation are best practices.
Following these conventions means anyone familiar with Terraform can quickly understand your module's interface and implementation.
modules/
└── vpc/
├── main.tf # Resources
├── variables.tf # Input variables
├── outputs.tf # Output values
├── versions.tf # Provider requirements
└── README.md # Documentation
Example module:
# modules/vpc/variables.tf
variable "name" {
description = "VPC name"
type = string
}
variable "cidr" {
description = "VPC CIDR block"
type = string
default = "10.0.0.0/16"
}
variable "azs" {
description = "Availability zones"
type = list(string)
}
# modules/vpc/main.tf
resource "aws_vpc" "this" {
cidr_block = var.cidr
enable_dns_hostnames = true
enable_dns_support = true
tags = {
Name = var.name
}
}
resource "aws_subnet" "public" {
count = length(var.azs)
vpc_id = aws_vpc.this.id
cidr_block = cidrsubnet(var.cidr, 8, count.index)
availability_zone = var.azs[count.index]
tags = {
Name = "${var.name}-public-${var.azs[count.index]}"
}
}
# modules/vpc/outputs.tf
output "vpc_id" {
description = "VPC ID"
value = aws_vpc.this.id
}
output "public_subnet_ids" {
description = "Public subnet IDs"
value = aws_subnet.public[*].id
}

Using the module:
module "vpc" {
source = "./modules/vpc"
name = "production"
cidr = "10.0.0.0/16"
azs = ["us-east-1a", "us-east-1b", "us-east-1c"]
}
# Reference outputs
resource "aws_instance" "web" {
subnet_id = module.vpc.public_subnet_ids[0]
# ...
}

What module sources does Terraform support?
Terraform can load modules from various sources, giving you flexibility in how you organize and share code. Local paths are simplest for development. The Terraform Registry provides public and private modules with versioning. Git repositories (GitHub, GitLab, Bitbucket) work for private modules with tag-based versioning.
Each source type has trade-offs. Local paths are convenient but don't version well. Registry modules have excellent versioning but require publishing. Git sources balance flexibility and versioning but require careful reference management.
# Local path
module "vpc" {
source = "./modules/vpc"
}
# Terraform Registry
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "5.0.0"
}
# GitHub
module "vpc" {
source = "github.com/org/repo//modules/vpc?ref=v1.0.0"
}
# S3 bucket
module "vpc" {
source = "s3::https://s3-eu-west-1.amazonaws.com/bucket/vpc.zip"
}

How should you version modules in production?
Version pinning is essential for stable production infrastructure. Without it, module updates could unexpectedly change your infrastructure. Always specify exact versions or constrained ranges in production configurations.
Exact versions provide the most stability but require manual updates. Pessimistic constraints (~>) allow patch updates while preventing breaking changes. Test module updates in non-production environments before promoting version changes to production.
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "5.0.0" # Exact version
# Or version constraints
# version = "~> 5.0" # >= 5.0.0, < 6.0.0
# version = ">= 5.0" # >= 5.0.0
}

Workspace and Environment Questions
Managing multiple environments is a common challenge in Terraform.
How do Terraform workspaces work?
Workspaces let you maintain multiple state files for the same configuration. Each workspace has its own state, so you can deploy the same infrastructure to dev, staging, and production with environment-specific differences controlled by the workspace name.
Workspaces are convenient for environments that are structurally similar. They share the same backend configuration and use conditional logic based on terraform.workspace to vary settings. However, they provide less isolation than separate directory structures.
# List workspaces
terraform workspace list
# Create workspace
terraform workspace new staging
# Switch workspace
terraform workspace select production
# Show current
terraform workspace show
# Delete workspace
terraform workspace delete staging

Using workspace in config:
resource "aws_instance" "web" {
instance_type = terraform.workspace == "prod" ? "t3.large" : "t3.micro"
tags = {
Environment = terraform.workspace
}
}

When should you use directory structure instead of workspaces?
The directory approach creates stronger isolation between environments by giving each its own configuration files, backend configuration, and potentially different module versions. This isolation reduces the risk of accidentally applying to the wrong environment and makes environment-specific customization more explicit.
Choose directories when environments have significant structural differences, when you need strict isolation (especially for production), or when different teams manage different environments. Use workspaces when environments are nearly identical and isolation requirements are lower.
terraform/
├── modules/
│ └── app/
├── environments/
│ ├── dev/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ ├── terraform.tfvars
│ │ └── backend.tf
│ ├── staging/
│ │ └── ...
│ └── prod/
│ └── ...
Each environment has its own:
- State file (different backend key)
- Variable values
- Provider configuration if needed
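Each environment's backend.tf then points at its own state key; a minimal sketch, assuming the S3 backend and bucket name used earlier in this guide:

```hcl
# environments/prod/backend.tf
terraform {
  backend "s3" {
    bucket = "my-terraform-state"
    key    = "prod/terraform.tfstate" # dev/ and staging/ use their own keys
    region = "us-east-1"
  }
}
```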
What are the trade-offs between workspaces and directories?
This question tests your ability to evaluate solutions based on specific requirements rather than following a single pattern blindly. Both approaches have valid use cases, and senior engineers understand when to use each.
| Aspect | Workspaces | Directories |
|---|---|---|
| State isolation | Same backend, different keys | Completely separate |
| Code duplication | None | Some (can use modules) |
| Variable differences | Conditional logic | Separate tfvars |
| Accidental cross-apply | Possible (wrong workspace) | Harder (different directory) |
| Best for | Similar environments | Very different environments |
Recommendation: Use directories for prod vs non-prod, workspaces for similar environments (dev1, dev2).
Best Practices Questions
Following established patterns separates professional Terraform users from beginners.
How should you organize Terraform code in a project?
Code organization affects maintainability and collaboration. The standard pattern separates concerns into distinct files: main.tf for resources, variables.tf for inputs, outputs.tf for exports, and so on. This convention makes large configurations navigable and helps team members find what they're looking for quickly.
Consistent organization across projects reduces cognitive load when switching between codebases and makes onboarding new team members faster.
project/
├── main.tf # Primary resources
├── variables.tf # All variable declarations
├── outputs.tf # All outputs
├── versions.tf # Terraform and provider versions
├── providers.tf # Provider configurations
├── locals.tf # Local values
├── data.tf # Data sources
└── terraform.tfvars # Variable values (don't commit secrets)
versions.tf:
terraform {
required_version = ">= 1.5.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}

What naming conventions should you follow?
Consistent naming makes configurations readable and maintainable. Terraform conventions use lowercase with underscores for all identifiers. Resource names should be descriptive but concise, indicating what the resource is for. Variables and outputs follow the same pattern.
Following community conventions means your code is immediately readable by other Terraform users and tools that expect standard patterns will work correctly.
# Resources: descriptive, lowercase, underscores
resource "aws_instance" "web_server" { }
resource "aws_security_group" "web_sg" { }
# Variables: lowercase, underscores
variable "instance_type" { }
variable "environment_name" { }
# Outputs: lowercase, underscores, descriptive
output "load_balancer_dns" { }
# Locals: lowercase, underscores
locals {
common_tags = { }
}

How should you handle secrets in Terraform?
Secrets management is critical and frequently tested in interviews because getting it wrong has serious security implications. Never hardcode secrets in Terraform files or commit them to version control. The state file contains sensitive data too, so always encrypt remote state at rest.
The best approaches reference secrets from external stores at runtime, use environment variables for CI/CD, or generate secrets with Terraform and immediately store them in a secrets manager. Mark sensitive outputs to prevent accidental exposure in logs.
Never do this:
# BAD - secrets in code
resource "aws_db_instance" "db" {
password = "supersecret123" # NO!
}

Better approaches:
# 1. Variable with no default (prompt or tfvars)
variable "db_password" {
type = string
sensitive = true
}
# 2. Environment variable
# export TF_VAR_db_password=xxx
# 3. External secret store
data "aws_secretsmanager_secret_version" "db" {
secret_id = "prod/db/password"
}
resource "aws_db_instance" "db" {
password = data.aws_secretsmanager_secret_version.db.secret_string
}
# 4. Generate and store
resource "random_password" "db" {
length = 32
special = true
}
resource "aws_secretsmanager_secret_version" "db" {
secret_id = aws_secretsmanager_secret.db.id
secret_string = random_password.db.result
}

How do you integrate Terraform with CI/CD?
CI/CD integration is standard for production Terraform usage. Pipelines enforce code quality checks, provide visibility into planned changes through PR comments, and ensure changes are applied consistently. Manual applies become exceptions rather than the norm.
Key practices include running format and validate on every commit, running plan on pull requests so reviewers see proposed changes, requiring approval before apply to production, and using OIDC authentication to avoid long-lived credentials.
GitHub Actions example:
name: Terraform
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
terraform:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: hashicorp/setup-terraform@v3
with:
terraform_version: 1.6.0
- name: Terraform Init
run: terraform init
env:
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
- name: Terraform Format Check
run: terraform fmt -check
- name: Terraform Plan
run: terraform plan -no-color
if: github.event_name == 'pull_request'
- name: Terraform Apply
run: terraform apply -auto-approve
if: github.ref == 'refs/heads/main' && github.event_name == 'push'

Best practices for CI/CD:
- Run `terraform fmt -check` to enforce formatting
- Run `terraform validate` for syntax errors
- Always run `plan` on PRs
- Require approval before `apply` to production
- Use OIDC for cloud authentication (no long-lived keys)
Troubleshooting and Scenario Questions
Scenario questions test your practical experience with real-world problems.
How do you handle state drift?
State drift occurs when real infrastructure differs from what Terraform expects, usually because someone made manual changes. This is a common operational challenge that interviewers want to know you can handle.
The response depends on whether you want to keep the manual change or revert it. If keeping it, update your configuration to match reality. If reverting, let Terraform apply correct the drift. Prevention is better than cure—lock down manual access and make all changes through Terraform pipelines.
# 1. Detect drift
terraform plan
# Shows: aws_instance.web will be updated (instance_type changed)
# 2. Decision point:
# Keep manual change → update your config to match
# Revert manual change → apply to correct drift
# 3. If keeping, update config:
resource "aws_instance" "web" {
instance_type = "t3.large" # Match manual change
}
# 4. Verify no changes
terraform plan
# No changes. Your infrastructure matches the configuration.

Prevention:
- Lock down console access
- Use CI/CD for all changes
- Enable drift detection alerts
- Regular `terraform plan` in CI
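For drift you deliberately tolerate (for example, tags edited by an external cost-tagging tool), a lifecycle ignore_changes rule tells Terraform to leave specific attributes alone; a hedged sketch:

```hcl
resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.micro"

  lifecycle {
    # Out-of-band changes to tags will no longer show up as drift
    ignore_changes = [tags]
  }
}
```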
What do you do when Terraform apply fails halfway?
Partial failures leave your infrastructure in an inconsistent state—some resources created, others not. Understanding recovery is essential for production operations. Terraform's state accurately reflects what was created, so you can inspect it to understand the current situation.
The recovery path depends on the failure cause. Often you can fix the configuration error and re-run apply. Sometimes you need to taint a resource to force recreation. In rare cases, you might need to manually fix state or restore from a backup.
# 1. Check what was created
terraform state list
# 2. Check current state vs desired
terraform plan
# 3. Options:
# - Fix the error and re-run apply
# - If resource is broken, force recreation:
terraform taint aws_instance.web
terraform apply
# (taint is deprecated; on Terraform 0.15.2+ prefer:
#  terraform apply -replace=aws_instance.web)
# 4. If state is corrupted:
# - Restore from state backup (S3 versioning)
# - Or manually fix with state commands

How do you rename a resource without destroying it?
Resource renaming is a common refactoring need, but Terraform interprets a name change as "delete old, create new." The state mv command tells Terraform the resource moved rather than being replaced, preserving the actual infrastructure.
This technique is essential for code cleanup without infrastructure impact. Always verify with a plan after moving to ensure Terraform sees no changes needed.
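On Terraform 1.1+, a moved block is a declarative alternative to state mv: commit it alongside the rename and Terraform records the move at the next plan/apply. A minimal sketch:

```hcl
moved {
  from = aws_instance.web
  to   = aws_instance.application
}
```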
# Before
resource "aws_instance" "web" { }
# After
resource "aws_instance" "application" { }# 1. Move in state
terraform state mv aws_instance.web aws_instance.application
# 2. Update code to use new name
# 3. Verify no changes
terraform plan
# No changes.

How do you migrate from local to remote state?
State migration is needed when adopting proper Terraform practices or changing backends. Terraform handles this gracefully with the init -migrate-state flag, which copies your existing state to the new backend.
Ensure the new backend is configured correctly before migrating. After migration, verify the remote state contains your resources and consider deleting the local state file to avoid confusion.
# 1. Add backend configuration
terraform {
backend "s3" {
bucket = "my-terraform-state"
key = "prod/terraform.tfstate"
region = "us-east-1"
}
}
# 2. Initialize with migration
terraform init -migrate-state
# Terraform will prompt to copy existing state to new backend

Quick Reference
What are the essential Terraform commands?
| Command | Purpose |
|---|---|
| `terraform init` | Initialize working directory |
| `terraform plan` | Preview changes |
| `terraform apply` | Apply changes |
| `terraform destroy` | Destroy infrastructure |
| `terraform fmt` | Format code |
| `terraform validate` | Validate syntax |
| `terraform state list` | List resources in state |
| `terraform import` | Import existing resource |
| `terraform output` | Show outputs |
What are common Terraform patterns?
# Conditional resource
count = var.create_resource ? 1 : 0
# Conditional attribute
instance_type = var.env == "prod" ? "t3.large" : "t3.micro"
# Dynamic blocks
dynamic "ingress" {
for_each = var.ports
content {
from_port = ingress.value
to_port = ingress.value
protocol = "tcp"
}
}
# Depends on (explicit dependency)
depends_on = [aws_iam_role_policy.example]
# Lifecycle rules
lifecycle {
create_before_destroy = true
prevent_destroy = true
ignore_changes = [tags]
}

Related Articles
This guide connects to the broader DevOps interview preparation:
Cloud Platforms:
- AWS Interview Guide - AWS resources and services
- Azure Interview Guide - Azure ARM comparison
- GCP Interview Guide - GCP Deployment Manager comparison
DevOps Fundamentals:
- CI/CD & GitHub Actions Interview Guide - Terraform in pipelines
- Docker Interview Guide - Container infrastructure
- Kubernetes Interview Guide - K8s provider
Architecture:
- System Design Interview Guide - Infrastructure patterns
- Networking Interview Guide - VPC and network resources
