AWS has over 200 services. Interviews focus on maybe 15 of them.
The difference between candidates who pass AWS interviews and those who don't isn't breadth of knowledge—it's depth in the services that matter. You can know what every service does and still fail if you can't explain how VPC networking actually works or when to use Lambda versus EC2.
This guide covers the core AWS services that appear in nearly every cloud interview, the concepts you need to understand deeply, and the questions interviewers actually ask.
AWS Fundamentals
Before diving into specific services, understand how AWS is organized.
Regions and Availability Zones
Region: A geographic area containing multiple data centers (e.g., us-east-1 in Virginia, eu-west-1 in Ireland). Each region is completely independent.
Availability Zone (AZ): One or more discrete data centers within a region. Each AZ has independent power, cooling, and networking. AZs within a region are connected by low-latency links.
Edge Location: Points of presence that serve CloudFront (CDN caching) and Route 53 (DNS) traffic. There are far more edge locations than regions, so they sit closer to users.
AWS Global Infrastructure:
├── Region (us-east-1)
│ ├── AZ (us-east-1a)
│ ├── AZ (us-east-1b)
│ └── AZ (us-east-1c)
├── Region (eu-west-1)
│ ├── AZ (eu-west-1a)
│ └── ...
└── Edge Locations (200+)
Interview question: "Why deploy across multiple AZs?"
For high availability. If one AZ fails (power outage, network issues), your application continues running in other AZs. Single-AZ deployments have a single point of failure.
Service Scope
Services operate at different scopes:
| Scope | Examples | Implication |
|---|---|---|
| Global | IAM, Route 53, CloudFront | Data replicated across all regions |
| Regional | S3, Lambda, VPC | Must be created in each region you use |
| AZ-scoped | EC2, EBS, Subnets | Tied to specific AZ, not automatically replicated |
Interview trap: "Is S3 regional or global?"
S3 buckets are regional (data stays in the region), but bucket names are globally unique. The S3 console shows all buckets globally, which confuses people.
Well-Architected Framework
AWS's six pillars of good architecture; interviewers expect you to know these:
| Pillar | Focus |
|---|---|
| Operational Excellence | Run and monitor systems, continuous improvement |
| Security | Protect information, systems, assets |
| Reliability | Recover from failures, meet demand |
| Performance Efficiency | Use resources efficiently |
| Cost Optimization | Avoid unnecessary costs |
| Sustainability | Minimize the environmental impact of workloads |
When answering architecture questions, frame your answers around these pillars.
Compute: EC2 and Lambda
The two most important compute services. Know when to use each.
EC2 Fundamentals
EC2 (Elastic Compute Cloud) provides virtual servers. Key concepts:
Instance Types: Named by family, generation, and size (e.g., m5.xlarge).
| Family | Use Case | Examples |
|---|---|---|
| M (General) | Balanced compute, memory, networking | Web servers, small databases |
| C (Compute) | CPU-intensive workloads | Batch processing, gaming servers |
| R (Memory) | Memory-intensive workloads | In-memory databases, caching |
| T (Burstable) | Variable workloads with burst capability | Dev environments, small apps |
| G/P (GPU) | Graphics, machine learning | ML training, video encoding |
Interview question: "When would you use a T instance versus an M instance?"
T instances are burstable—they accumulate CPU credits when idle and spend them during bursts. Good for variable workloads. M instances provide consistent performance. Use T for dev/test or apps with occasional spikes; use M for production with steady load.
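To make the credit mechanism concrete, here is a minimal simulation of a CPU credit balance. It is a simplified sketch, not AWS's actual accounting: the function name `simulate_cpu_credits` and all default numbers (baseline, vCPU count, credit cap) are illustrative assumptions. It relies on one definition: one CPU credit equals one vCPU running at 100% for one minute.

```python
def simulate_cpu_credits(utilization_per_minute, baseline=0.10, vcpus=2,
                         starting_credits=0.0, max_credits=288.0):
    """Track a burstable instance's CPU credit balance minute by minute.

    One CPU credit = one vCPU running at 100% for one minute.
    `baseline` is the per-vCPU utilization the instance can sustain forever.
    All default numbers are illustrative assumptions, not official specs.
    """
    balance = starting_credits
    history = []
    for utilization in utilization_per_minute:
        earned = baseline * vcpus      # credits earned this minute
        spent = utilization * vcpus    # credits spent this minute
        # The balance can neither go negative nor exceed the cap.
        balance = min(max_credits, max(0.0, balance + earned - spent))
        history.append(round(balance, 2))
    return history


if __name__ == "__main__":
    # One idle hour followed by a ten-minute burst at 100% CPU.
    print(simulate_cpu_credits([0.0] * 60 + [1.0] * 10))
```

In this model an idle hour banks enough credits for only a few minutes of full-speed burst, which is why a steadily loaded service tends to fit an M instance better.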
EC2 Pricing Models
| Model | Description | Best For |
|---|---|---|
| On-Demand | Pay per hour/second, no commitment | Short-term, unpredictable workloads |
| Reserved | 1- or 3-year commitment, up to 72% discount | Steady-state, predictable workloads |
| Spot | Use spare capacity at the current Spot price (no bidding required), up to 90% discount, can be reclaimed | Fault-tolerant, flexible workloads |
| Savings Plans | Commit to $/hour usage, flexible across instance types | Similar to Reserved but more flexible |
Interview question: "Your batch processing job can tolerate interruptions. How would you reduce costs?"
Use Spot Instances. They're up to 90% cheaper but can be terminated with 2-minute notice. For fault-tolerant batch jobs, use Spot with checkpointing—save progress regularly so you can resume if interrupted.
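The checkpointing idea can be sketched in plain Python. This is one possible shape, not a prescribed pattern: `process_with_checkpoint`, the checkpoint file format, and the `should_stop` callback (which in practice might poll for the interruption notice) are all hypothetical names chosen for illustration.

```python
import json
import os


def process_with_checkpoint(items, checkpoint_path, handle,
                            should_stop=lambda: False):
    """Process `items` in order, saving progress so a restart can resume.

    Returns True when every item has been handled, False if stopped early.
    """
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["next_index"]

    for index in range(start, len(items)):
        if should_stop():      # e.g. an interruption notice was seen
            return False       # not finished; progress is already on disk
        handle(items[index])
        # Write to a temp file and rename it, so being killed mid-write
        # cannot leave a corrupt checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"next_index": index + 1}, f)
        os.replace(tmp_path, checkpoint_path)
    return True
```

On a real Spot fleet the checkpoint would usually live somewhere that survives the instance, such as S3 or a database, rather than on local disk.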
Lambda Fundamentals
Lambda runs code without provisioning servers. Key characteristics:
- Event-driven: Triggered by events (API Gateway, S3, SQS, etc.)
- Pay per invocation: Charged for requests and compute time
- Auto-scaling: Scales automatically from 0 to thousands of concurrent executions
- Time limit: Maximum 15 minutes per invocation
Cold Starts: First invocation after idle period is slower (Lambda must initialize the runtime). Subsequent invocations reuse the warm container.
Cold start: Request → Initialize Runtime → Load Code → Run Handler → Response
Warm start: Request → Run Handler → Response
Reducing cold starts:
- Use Provisioned Concurrency (keeps instances warm)
- Smaller deployment packages
- Choose faster runtimes (Python, Node.js faster than Java)
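A related habit is to keep expensive setup outside the handler body, so that warm invocations skip it. The sketch below imitates that with a module-level cache; `_get_client`, the sleep, and the dictionary standing in for an SDK client or database connection are placeholders, not real AWS calls.

```python
import time

_INIT_COUNT = 0
_client = None   # stands in for an SDK client or database connection


def _get_client():
    """Create the expensive resource once per container, then reuse it."""
    global _client, _INIT_COUNT
    if _client is None:
        time.sleep(0.05)               # simulated slow initialization
        _client = {"connected": True}
        _INIT_COUNT += 1
    return _client


def handler(event, context):
    """Lambda-style entry point: only the first call pays the setup cost."""
    client = _get_client()
    return {"ok": client["connected"], "initializations": _INIT_COUNT}
```

This does not shorten the cold start itself; it only ensures the cost is paid once per container rather than on every request.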
EC2 vs Lambda Decision Tree
Need runtime > 15 minutes?
├── Yes → EC2 or ECS/EKS
└── No
└── Predictable, constant traffic?
├── Yes → EC2 (often cheaper at scale)
└── No
└── Need specific OS or runtime?
├── Yes → EC2
└── No → Lambda (simplicity wins)
Interview question: "When would Lambda be more expensive than EC2?"
At high, constant utilization. Lambda charges per invocation and GB-second. If you're running 24/7 at full capacity, a Reserved EC2 instance is usually cheaper. Lambda wins for variable, spiky, or low-utilization workloads.
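A rough break-even calculation makes this answer easier to defend. The sketch below is an estimate under stated assumptions: the default Lambda prices are commonly quoted x86 list prices and may differ by region or over time, the free tier is ignored, and the EC2 hourly price passed in is a hypothetical figure you would replace with a real one.

```python
def lambda_monthly_cost(requests, avg_duration_s, memory_gb,
                        price_per_million_requests=0.20,
                        price_per_gb_second=0.0000166667):
    """Estimate monthly Lambda cost (assumed prices, free tier ignored)."""
    request_cost = requests / 1_000_000 * price_per_million_requests
    compute_cost = requests * avg_duration_s * memory_gb * price_per_gb_second
    return request_cost + compute_cost


def ec2_monthly_cost(hourly_price, instances=1, hours=730):
    """Estimate monthly cost of always-on instances at a given hourly price."""
    return hourly_price * instances * hours


if __name__ == "__main__":
    ec2 = ec2_monthly_cost(0.04)   # hypothetical hourly price
    for monthly_requests in (1_000_000, 100_000_000):
        cost = lambda_monthly_cost(monthly_requests, 0.1, 0.5)
        print(monthly_requests, round(cost, 2), "vs EC2", round(ec2, 2))
```

With these assumed numbers Lambda is far cheaper at one million requests a month and several times more expensive at one hundred million, which is the crossover the answer describes.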
Storage: S3 and EBS
S3 for objects, EBS for block storage. Different use cases, often confused.
S3 Fundamentals
S3 (Simple Storage Service) is object storage. Key concepts:
- Buckets: Containers for objects (globally unique names)
- Objects: Files up to 5TB with metadata
- Keys: The full path to an object (e.g., photos/2026/vacation.jpg)
Storage Classes:
| Class | Access Pattern | Retrieval | Cost |
|---|---|---|---|
| Standard | Frequent access | Immediate | Highest |
| Intelligent-Tiering | Unknown pattern | Immediate | Auto-optimized |
| Standard-IA | Infrequent (30+ days) | Immediate | Lower + retrieval fee |
| One Zone-IA | Infrequent, non-critical | Immediate | Lower, single AZ |
| Glacier Instant | Archive, rare access | Milliseconds | Low + retrieval fee |
| Glacier Flexible | Archive | Minutes to hours | Lower |
| Glacier Deep Archive | Long-term archive | Within 12 hours (standard) to 48 hours (bulk) | Lowest |
Lifecycle Policies: Automatically transition objects between classes or delete them after a period.
{
  "Rules": [{
    "Status": "Enabled",
    "Filter": {"Prefix": ""},
    "Transitions": [
      {"Days": 30, "StorageClass": "STANDARD_IA"},
      {"Days": 90, "StorageClass": "GLACIER"}
    ],
    "Expiration": {"Days": 365}
  }]
}
S3 Security
Multiple layers of access control:
Bucket Policies: JSON policies attached to buckets. Control who can access and what they can do.
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::123456789012:role/MyRole"},
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::my-bucket/*"
  }]
}
ACLs: Legacy access control, generally avoid in favor of bucket policies.
Block Public Access: Account or bucket-level settings to prevent accidental public exposure. Enable by default.
Encryption:
- SSE-S3: AWS manages keys
- SSE-KMS: You control keys in KMS (audit trail)
- SSE-C: You provide keys with each request
- Client-side: Encrypt before uploading
Interview question: "How would you ensure no S3 bucket in your account is ever publicly accessible?"
Enable S3 Block Public Access at the account level. This prevents any bucket policy or ACL from granting public access. Also use AWS Config rules to detect and alert on misconfigurations.
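Block Public Access consists of four separate flags, and an audit usually needs to confirm that all of them are on. The check below is a small sketch: the flag names follow S3's PublicAccessBlockConfiguration, while the function name and the idea of passing the configuration in as a plain dictionary (rather than fetching it from the API) are assumptions made to keep the example self-contained.

```python
REQUIRED_FLAGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def missing_block_public_access_flags(config):
    """Return the Block Public Access flags that are not set to True."""
    return [flag for flag in REQUIRED_FLAGS if config.get(flag) is not True]


if __name__ == "__main__":
    partial = {"BlockPublicAcls": True, "IgnorePublicAcls": True}
    print(missing_block_public_access_flags(partial))
```

An empty result means the configuration is fully locked down; anything else is worth an alert.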
EBS Fundamentals
EBS (Elastic Block Store) provides block storage volumes for EC2. Like a hard drive attached to your instance.
Volume Types:
| Type | Use Case | IOPS | Throughput |
|---|---|---|---|
| gp3 | General purpose SSD | Up to 16,000 | Up to 1,000 MB/s |
| gp2 | General purpose SSD (legacy) | 3 IOPS per GiB, burst to 3,000, max 16,000 | 125-250 MB/s |
| io2 | High-performance SSD | Up to 64,000 | Up to 1,000 MB/s |
| st1 | Throughput HDD | N/A | Up to 500 MB/s |
| sc1 | Cold HDD | N/A | Up to 250 MB/s |
Key characteristics:
- Tied to a single AZ (not automatically replicated)
- Can only attach to one EC2 instance at a time (except io1/io2 volumes with Multi-Attach enabled)
- Snapshots stored in S3 (regional, can copy cross-region)
Interview question: "Your database needs consistent high IOPS. Which EBS volume type?"
io2 (or io2 Block Express for extreme performance). gp3 provides up to 16,000 IOPS but io2 can deliver up to 64,000. For databases like high-transaction OLTP, io2's provisioned IOPS guarantees consistent performance.
S3 vs EBS vs EFS
| Feature | S3 | EBS | EFS |
|---|---|---|---|
| Type | Object storage | Block storage | File storage |
| Access | HTTP API | Attach to EC2 | Mount as NFS |
| Sharing | Any number of clients | One EC2 (usually) | Multiple EC2s |
| Scope | Regional | Single AZ | Regional |
| Use Case | Static files, backups, data lakes | Boot volumes, databases | Shared file systems |
Networking: VPC
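VPC (Virtual Private Cloud) is the foundation of AWS networking. EC2 instances and RDS databases always run inside a VPC; Lambda functions run outside your VPC by default and can be attached to it when they need to reach private resources.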
VPC Components
VPC: Your isolated network in AWS. Define an IP range (CIDR block, e.g., 10.0.0.0/16).
Subnet: A segment of your VPC in a single AZ. Public subnets have routes to the internet; private subnets don't.
Route Table: Rules determining where network traffic goes. Each subnet associates with one route table.
Internet Gateway (IGW): Allows communication between your VPC and the internet. Attach to VPC, add route in public subnet route table.
NAT Gateway: Allows private subnet instances to reach the internet (for updates, APIs) without being reachable from the internet. Placed in public subnet.
VPC (10.0.0.0/16)
├── Public Subnet (10.0.1.0/24) - AZ-a
│ ├── Route: 0.0.0.0/0 → IGW
│ └── NAT Gateway
├── Public Subnet (10.0.2.0/24) - AZ-b
│ └── Route: 0.0.0.0/0 → IGW
├── Private Subnet (10.0.3.0/24) - AZ-a
│ └── Route: 0.0.0.0/0 → NAT Gateway
├── Private Subnet (10.0.4.0/24) - AZ-b
│ └── Route: 0.0.0.0/0 → NAT Gateway
└── Internet Gateway
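The CIDR layout in the diagram can be reproduced with Python's standard `ipaddress` module, which is handy for sanity-checking a subnet plan before creating anything. The helper names and the subnet labels are hypothetical; the only AWS-specific fact used is that five addresses in every subnet are reserved.

```python
import ipaddress
from itertools import islice


def plan_subnets(vpc_cidr, names, new_prefix=24, first_index=1):
    """Assign consecutive, non-overlapping subnet CIDRs to `names`."""
    vpc = ipaddress.ip_network(vpc_cidr)
    # Skip the first `first_index` blocks so numbering starts at x.x.1.0.
    blocks = islice(vpc.subnets(new_prefix=new_prefix), first_index, None)
    return {name: str(block) for name, block in zip(names, blocks)}


def usable_hosts(cidr, reserved_by_aws=5):
    """AWS reserves 5 addresses per subnet (the first four and the last)."""
    return ipaddress.ip_network(cidr).num_addresses - reserved_by_aws


if __name__ == "__main__":
    plan = plan_subnets("10.0.0.0/16",
                        ["public-a", "public-b", "private-a", "private-b"])
    for name, cidr in plan.items():
        print(name, cidr, usable_hosts(cidr))
```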
Public vs Private Subnets
| Characteristic | Public Subnet | Private Subnet |
|---|---|---|
| Route to IGW | Yes | No |
| Public IP | Can have | No |
| Reachable from internet | Yes | No |
| Can reach internet | Yes | Via NAT Gateway |
| Typical use | Load balancers, bastion hosts | Application servers, databases |
Interview question: "Why put your database in a private subnet?"
Security. Databases shouldn't be directly accessible from the internet. Private subnets have no route to the Internet Gateway, so even if security groups are misconfigured, the database isn't publicly reachable. Defense in depth.
Security Groups vs NACLs
| Feature | Security Group | NACL |
|---|---|---|
| Scope | Instance level | Subnet level |
| Rules | Allow only | Allow and Deny |
| Statefulness | Stateful | Stateless |
| Evaluation | All rules evaluated | Rules evaluated in order |
| Default | Deny all inbound, allow all outbound | Default NACL allows all; a custom NACL denies all until you add rules |
Stateful vs Stateless:
- Security Groups: If you allow inbound traffic, return traffic is automatically allowed
- NACLs: You must explicitly allow both inbound and outbound for return traffic
Interview question: "Traffic is blocked even though Security Group allows it. What could cause this?"
A NACL might be blocking it. For inbound traffic, the subnet's NACL is evaluated before the instance's Security Group. Also check: route tables (traffic might not be routed correctly), ephemeral ports for return traffic in NACLs (NACLs are stateless, so replies need their own rule), or the source might not be what you expect (NAT changes the source IP).
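The stateful versus stateless difference is easier to reason about with a toy model. The sketch below is a deliberate simplification that only looks at ports: the rule tuples and function names are invented for illustration, and real rules also match protocol and CIDR.

```python
def nacl_allows(rules, port):
    """Evaluate NACL rules in ascending rule-number order; first match wins.

    Each rule is (rule_number, (low_port, high_port), "allow" | "deny").
    """
    for _number, (low, high), action in sorted(rules, key=lambda r: r[0]):
        if low <= port <= high:
            return action == "allow"
    return False   # the implicit final deny


def security_group_allows(rules, port):
    """Security groups only hold allow rules; any matching rule permits."""
    return any(low <= port <= high for low, high in rules)


def inbound_request_succeeds(port, client_port, sg_in, nacl_in, nacl_out):
    """A request works only if it gets in AND the reply can get back out."""
    request_ok = nacl_allows(nacl_in, port) and security_group_allows(sg_in, port)
    # Security group is stateful: the reply is allowed automatically.
    # NACL is stateless: the reply to the client's ephemeral port needs a rule.
    reply_ok = nacl_allows(nacl_out, client_port)
    return request_ok and reply_ok
```

In this model an HTTPS request passes the Security Group and the inbound NACL yet still fails when the outbound NACL has no rule for the client's ephemeral port, which is the classic form of this interview question.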
VPC Connectivity Options
| Method | Use Case |
|---|---|
| VPC Peering | Connect two VPCs (same or different accounts/regions) |
| Transit Gateway | Hub-and-spoke for multiple VPCs |
| VPN | Encrypted connection to on-premises over internet |
| Direct Connect | Dedicated private connection to on-premises |
| PrivateLink | Access AWS services or your services without internet |
Security: IAM
IAM (Identity and Access Management) controls who can do what in your AWS account. Security questions appear in every AWS interview.
IAM Concepts
Users: Individual identities with long-term credentials (password, access keys).
Groups: Collections of users. Attach policies to groups, not individual users.
Roles: Identities assumed by services, applications, or users. No long-term credentials—provide temporary credentials via STS.
Policies: JSON documents defining permissions. Attached to users, groups, or roles.
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "s3:GetObject",
      "s3:PutObject"
    ],
    "Resource": "arn:aws:s3:::my-bucket/*"
  }]
}
Policy Types
| Type | Description |
|---|---|
| Identity-based | Attached to users, groups, roles |
| Resource-based | Attached to resources (S3 bucket policy, SQS policy) |
| Permission boundaries | Maximum permissions an identity can have |
| Service control policies | Organization-level limits (AWS Organizations) |
Principle of Least Privilege
Grant only the permissions needed to perform a task. No more.
Bad: "Action": "*", "Resource": "*"
Good: "Action": "s3:GetObject", "Resource": "arn:aws:s3:::specific-bucket/*"
Interview tip: Always mention least privilege when discussing IAM. It's a fundamental security principle.
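It helps to see how policy statements combine: an explicit Deny always wins, an explicit Allow is needed for access, and everything else is denied implicitly. The evaluator below is a heavily simplified sketch of that ordering. It ignores conditions, principals, permission boundaries and SCPs, and it uses case-sensitive `fnmatch` wildcards as a stand-in for IAM's own matching; `is_allowed` is a hypothetical helper, not an AWS API.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM accepts a single string or a list in most fields."""
    return value if isinstance(value, list) else [value]


def _matches(patterns, value):
    return any(fnmatchcase(value, pattern) for pattern in _as_list(patterns))


def is_allowed(policies, action, resource):
    """Simplified evaluation: explicit deny > explicit allow > implicit deny."""
    allowed = False
    for policy in policies:
        for statement in _as_list(policy["Statement"]):
            if (_matches(statement["Action"], action)
                    and _matches(statement["Resource"], resource)):
                if statement["Effect"] == "Deny":
                    return False       # an explicit deny always wins
                allowed = True
    return allowed                     # False here means implicit deny
```

Running the least-privilege policy from above through this function shows why narrow statements are safer: anything not listed, such as s3:DeleteObject, falls through to the implicit deny.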
IAM Roles for Services
EC2 instances and Lambda functions should use roles, not hardcoded credentials.
EC2 Instance Profile: Attach a role to EC2. The instance can then call AWS APIs without credentials in code.
# No credentials needed - uses instance role
import boto3

s3 = boto3.client('s3')
s3.list_buckets()
Interview question: "How should an EC2 instance access S3?"
Use an IAM role attached via instance profile. Never store access keys on the instance. The SDK automatically uses the role's temporary credentials from the instance metadata service.
Cross-Account Access
Common pattern: Account A needs to access resources in Account B.
1. Account B creates a role with:
   - Trust policy allowing Account A to assume it
   - Permission policy granting needed access
2. Account A calls sts:AssumeRole to get temporary credentials
3. Account A uses those credentials to access Account B's resources
Trust policy on Account B's role:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::ACCOUNT_A_ID:root"},
    "Action": "sts:AssumeRole"
  }]
}
Databases: RDS and DynamoDB
Know when to use relational versus NoSQL, and the AWS-specific features.
RDS Fundamentals
RDS (Relational Database Service) manages relational databases: MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, and Aurora.
What RDS manages for you:
- Provisioning and patching
- Backups (automated, point-in-time recovery)
- High availability (Multi-AZ)
- Read scaling (Read Replicas)
- Monitoring and metrics
What you still manage:
- Schema design
- Query optimization
- Application-level concerns
Multi-AZ vs Read Replicas
| Feature | Multi-AZ | Read Replica |
|---|---|---|
| Purpose | High availability | Read scaling |
| Replication | Synchronous | Asynchronous |
| Failover | Automatic | Manual promotion |
| Read traffic | No (standby not accessible) | Yes |
| Cross-region | No | Yes |
Interview question: "Your database needs both high availability and read scaling. What do you configure?"
Enable Multi-AZ for high availability (automatic failover to standby). Create Read Replicas for read scaling (distribute read traffic). These are independent features—you can and should use both.
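On the application side, using both features usually means sending writes to the primary endpoint and spreading reads across replicas. The router below is a naive sketch of that split: the class name, the endpoint strings, and the "starts with SELECT" rule are all assumptions for illustration, and production code would more likely rely on a driver or proxy feature.

```python
from itertools import cycle


class EndpointRouter:
    """Send reads to replicas in round-robin order, everything else to the writer."""

    def __init__(self, writer_endpoint, reader_endpoints):
        self.writer_endpoint = writer_endpoint
        # Fall back to the writer when no replicas are configured.
        self._readers = cycle(reader_endpoints or [writer_endpoint])

    def endpoint_for(self, sql):
        words = sql.split()
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT":
            return next(self._readers)
        return self.writer_endpoint
```

Because replicas are asynchronous, a read routed this way can be slightly stale; reads that must see a just-committed write should go to the primary.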
Aurora
Aurora is AWS's cloud-native relational database. Compatible with MySQL and PostgreSQL but with better performance and availability.
Key differences from standard RDS:
- Storage auto-scales up to 128 TB
- 6 copies of data across 3 AZs
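- Up to 15 Aurora Replicas that share the cluster's storage volume (standard RDS read replicas each keep their own copy and replicate asynchronously)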
- Faster failover (typically under 30 seconds)
- Aurora Serverless for variable workloads
Interview question: "When would you choose Aurora over standard RDS MySQL?"
When you need better availability (6-way replication), faster failover, more read replicas, or auto-scaling storage. Aurora costs more but provides enterprise-grade reliability. For smaller workloads where cost matters more, standard RDS is fine.
DynamoDB Fundamentals
DynamoDB is a fully managed NoSQL database. Key characteristics:
- Key-value and document store
- Single-digit millisecond latency at any scale
- Automatic scaling (on-demand or provisioned capacity)
- No servers to manage
Data Model:
- Table: Collection of items
- Item: A single record (like a row)
- Attributes: Data elements (like columns, but flexible)
- Primary Key: Partition key (required) + optional sort key
Table: Orders
├── Partition Key: customer_id
├── Sort Key: order_date
└── Items:
├── {customer_id: "123", order_date: "2026-01-07", total: 99.99, ...}
└── {customer_id: "123", order_date: "2026-01-06", items: [...], ...}
DynamoDB Keys and Indexes
Partition Key: Determines which partition stores the item. Must be unique (if no sort key) or unique in combination with sort key.
Sort Key: Orders items within a partition. Enables range queries within a partition.
GSI (Global Secondary Index): Different partition key than the table. Eventually consistent. Query any attribute.
LSI (Local Secondary Index): Same partition key, different sort key. Strongly consistent. Must create at table creation.
Interview question: "You need to query orders by customer_id and also by order_status. How would you design this?"
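Primary key: partition key = customer_id, sort key = order_date. This supports "get all orders for customer X" efficiently. Create a GSI with partition key = order_status to support "get all pending orders" queries. Because order_status has only a few distinct values, a busy table can concentrate traffic on a handful of index partitions, so mention that trade-off.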
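The two access patterns can be mimicked with an in-memory toy, which is a useful way to check a key design before touching a real table. Nothing here calls DynamoDB: `OrdersTable` and its methods are hypothetical, the second dictionary merely stands in for a GSI on order_status, and ISO date strings are assumed so that string order matches date order.

```python
from collections import defaultdict


class OrdersTable:
    """Toy in-memory model of a table keyed by (customer_id, order_date)."""

    def __init__(self):
        self._by_customer = defaultdict(dict)    # partition key -> sort key -> item
        self._status_index = defaultdict(list)   # stands in for a GSI on order_status

    def put(self, item):
        partition = self._by_customer[item["customer_id"]]
        old = partition.get(item["order_date"])
        if old is not None:
            # Overwriting an item must also update the index entry.
            self._status_index[old["order_status"]].remove(old)
        partition[item["order_date"]] = item
        self._status_index[item["order_status"]].append(item)

    def query_customer(self, customer_id, date_from="", date_to="9999-12-31"):
        """Range query on the sort key within a single partition."""
        partition = self._by_customer.get(customer_id, {})
        return [partition[d] for d in sorted(partition) if date_from <= d <= date_to]

    def query_status(self, order_status):
        """Lookup through the index, ordered by order_date."""
        return sorted(self._status_index.get(order_status, []),
                      key=lambda item: item["order_date"])
```

Both queries touch only the data they need, while a question such as "all orders over a given total" would require a scan, which is the signal that a new index or a different store is needed.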
RDS vs DynamoDB Decision
| Factor | Choose RDS | Choose DynamoDB |
|---|---|---|
| Data model | Complex relationships, joins | Simple access patterns |
| Query patterns | Ad-hoc, complex queries | Known, limited patterns |
| Scale | Vertical (bigger instances) | Horizontal (unlimited) |
| Consistency | Strong (ACID) | Eventually consistent (default) |
| Schema | Fixed schema | Flexible schema |
Common Interview Questions
Architecture Scenarios
Q: Design a highly available web application on AWS.
A:
- Multi-AZ deployment across at least 2 AZs
- Application Load Balancer distributing traffic
- Auto Scaling Group for EC2 instances
- RDS Multi-AZ for database
- S3 for static assets, CloudFront for CDN
- Private subnets for app/database, public for ALB
- Security Groups limiting access between tiers
Q: How would you reduce costs for a development environment?
A:
- Use smaller instance types (t3.micro, t3.small)
- Schedule instances to stop outside business hours (Lambda + EventBridge, formerly CloudWatch Events)
- Use Spot Instances for non-critical workloads
- Single-AZ RDS (availability less critical in dev)
- Delete unused EBS volumes and snapshots
- Review and right-size based on CloudWatch metrics
Q: Your application needs to process files uploaded to S3. Design this.
A:
- S3 bucket with event notification
- S3 triggers Lambda on object creation
- Lambda processes small files directly
- For large files or long-running jobs, Lambda sends a message to SQS and EC2/ECS workers process it
- Results stored in S3 or database
- Dead-letter queue for failed processing
- CloudWatch alarms for failures
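The Lambda step in this design mostly consists of unpacking the S3 event. A minimal handler might look like the sketch below; the field paths follow the S3 event notification format, while `extract_s3_objects` and the placeholder processing step are illustrative choices.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # Keys arrive URL-encoded (spaces become '+'), so decode them.
        key = unquote_plus(s3_info["object"]["key"])
        objects.append((bucket, key))
    return objects


def handler(event, context):
    processed = []
    for bucket, key in extract_s3_objects(event):
        # Real processing (download, transform, store) would go here.
        processed.append(f"s3://{bucket}/{key}")
    return {"processed": processed}
```

Forgetting to decode the key is a common cause of "object not found" errors for file names that contain spaces or special characters.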
Troubleshooting Questions
Q: EC2 instance can't reach the internet. What do you check?
A:
- Is it in a public or private subnet?
- Public subnet: Does it have a public IP? Is there a route to IGW?
- Private subnet: Is there a NAT Gateway? Route to NAT?
- Security Group: Outbound rules allow the traffic?
- NACL: Allow outbound and inbound for return traffic?
- Is the IGW/NAT Gateway actually created and attached?
Q: Lambda function times out when accessing RDS. Why?
A:
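- Is Lambda attached to the VPC? A function outside the VPC can't reach an RDS instance in a private subnet. (Once attached, it needs a NAT Gateway or VPC endpoints to reach the internet or AWS APIs.)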
- Security Group on RDS allows traffic from Lambda's security group?
- Lambda configured with same VPC and subnets that can reach RDS?
- RDS in private subnet, Lambda also in private subnet (or has path)?
- Cold start + connection establishment exceeding timeout?
Q: S3 bucket policy allows access but requests are denied. Why?
A:
- S3 Block Public Access enabled at bucket or account level?
- IAM policy on the user/role explicitly denying?
- Permission boundary restricting access?
- VPC endpoint policy restricting access?
- Bucket policy condition not met (IP, VPC, MFA)?
Quick Reference
Compute:
- EC2: Virtual servers, full control, many instance types
- Lambda: Serverless, event-driven, 15-minute max, auto-scaling
Storage:
- S3: Object storage, unlimited, multiple storage classes
- EBS: Block storage for EC2, single AZ, snapshots to S3
- EFS: Shared file storage, multiple EC2s, regional
Networking:
- VPC: Your isolated network
- Subnet: Segment in single AZ, public or private
- Security Group: Stateful, instance-level, allow rules only
- NACL: Stateless, subnet-level, allow and deny rules
Database:
- RDS: Managed relational, Multi-AZ for HA, Read Replicas for scaling
- DynamoDB: NoSQL, single-digit ms latency, auto-scaling
IAM:
- Users/Groups: Human identities
- Roles: Service/application identities, temporary credentials
- Policies: JSON permission documents
- Always use least privilege
What's Next?
AWS interviews reward depth over breadth. It's better to deeply understand EC2, S3, VPC, and IAM than to superficially know 50 services.
Start with the services covered here—they appear in virtually every AWS interview. Get hands-on experience: create a VPC from scratch, deploy an application with an ALB and Auto Scaling Group, set up proper IAM roles. The console is fine for learning, but understand what you're creating.
Once you're confident with the fundamentals, expand to services relevant to your role: ECS/EKS for containers, CloudFormation/CDK for infrastructure as code, or specialized services for your domain.
