What happens when you type a URL in a browser?

DNS resolution finds the IP address (checking the browser cache, OS cache, and resolver, then the root, TLD, and authoritative servers). The browser establishes a TCP connection via the three-way handshake, then performs a TLS handshake for HTTPS (certificate verification, key exchange). It sends the HTTP request, and the server processes it and returns a response. The browser renders the HTML, fetching additional resources (CSS, JS, images) as needed. Each resource may require a new connection or reuse an existing one via keep-alive.

How do you troubleshoot a connection timeout?

Start with ping to test basic connectivity. Use traceroute to identify where packets stop. Check DNS resolution with dig or nslookup. Verify the port is open with telnet or nc. Check firewall rules on both ends. Use tcpdump to see if packets arrive. Check application logs for errors. Common causes: firewall blocking, DNS failure, service not running, wrong port, network routing issues.

What is CIDR notation and how does subnetting work?

CIDR notation like 10.0.0.0/24 indicates network address and subnet mask. The /24 means 24 bits for network, 8 bits for hosts (256 addresses, 254 usable). /16 gives 65,536 addresses, /32 is a single host. Subnetting divides networks into smaller segments for security, organization, and efficient IP usage. In cloud environments, you typically use /16 for VPCs and /24 for subnets.

What DNS record types should developers know?

A records map domain to IPv4 address. AAAA maps to IPv6. CNAME creates aliases pointing to another domain (can't be used at apex). MX specifies mail servers with priority. TXT holds arbitrary text (used for SPF, DKIM, domain verification). NS delegates to nameservers. PTR provides reverse DNS lookup. Understanding TTL (time-to-live) is crucial - low TTL for flexibility, high TTL for performance.

50+ Networking Interview Questions 2025: TCP/IP, DNS & Load Balancing

Q: What is the difference between TCP and UDP?

TCP is connection-oriented with guaranteed delivery, ordering, and error checking - used for HTTP, SSH, databases. UDP is connectionless with no delivery guarantees but lower latency - used for DNS queries, video streaming, gaming. TCP has a three-way handshake (SYN, SYN-ACK, ACK) while UDP just sends packets. Choose TCP when reliability matters, UDP when speed matters and you can handle packet loss.

Q: What is the difference between Layer 4 and Layer 7 load balancing?

Layer 4 (transport) load balancers route based on IP and port - fast but no content inspection. Layer 7 (application) load balancers inspect HTTP headers, URLs, and cookies - can route based on path, host, or content. Use L4 for raw TCP/UDP traffic or maximum performance. Use L7 when you need path-based routing, SSL termination, or request manipulation.

Networking knowledge separates good DevOps engineers from great ones. When production goes down at 3 AM, you need to know whether the problem is DNS, a firewall rule, or a failing load balancer—and you need to know fast.

This guide covers the networking fundamentals that come up in DevOps, SRE, and backend interviews. Not certification-level theory, but practical knowledge for debugging real systems.

Networking Fundamentals Questions
DNS Questions
HTTP and HTTPS Questions
Load Balancing Questions
Firewall and Security Questions
Network Troubleshooting Questions
Classic Interview Scenario Questions
Quick Reference

Networking Fundamentals Questions

Understanding the core networking concepts is essential for any DevOps or backend engineer interview.

What is the OSI model and which layers matter most for troubleshooting?

The OSI model is a conceptual framework that describes how data moves through a network in seven layers. While you don't need to memorize all seven layers for most interviews, understanding the practical layers helps you systematically debug network issues.

When troubleshooting, you typically work up the stack: if you can't ping, suspect Layer 3 (addressing or routing), although a firewall that drops ICMP produces the same symptom; if you can ping but can't connect to a port, look at Layer 4; if the connection succeeds but the app fails, the problem is at Layer 7.

Layer 7 - Application    HTTP, DNS, SSH (what your app speaks)
Layer 4 - Transport      TCP, UDP (how data gets delivered)
Layer 3 - Network        IP, routing (where data goes)
Layer 2 - Data Link      MAC addresses, switches (local network)
Layer 1 - Physical       Cables, signals (hardware)

What is the difference between TCP and UDP?

TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) are the two main transport layer protocols, each designed for different use cases. TCP prioritizes reliability, while UDP prioritizes speed.

TCP is connection-oriented, meaning it establishes a connection before sending data using a three-way handshake. It guarantees delivery through acknowledgments, ensures packets arrive in order, and implements flow control and congestion control. This makes TCP ideal for HTTP, HTTPS, SSH, FTP, SMTP, and database connections.

UDP is connectionless—it simply sends packets without establishing a connection first. There's no delivery guarantee or ordering guarantee, but this results in lower latency and less overhead. UDP is used for DNS queries, video streaming, gaming, and VoIP where occasional packet loss is acceptable.

sequenceDiagram
    participant C as Client
    participant S as Server
    C->>S: SYN (I want to connect)
    S->>C: SYN-ACK (OK, I acknowledge)
    C->>S: ACK (Great, let's talk)
    Note over C,S: Connection established, data flows in both directions
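To make the contrast concrete, here is a minimal sketch that sends the same payload over UDP and then TCP on the loopback interface, using only Python's standard `socket` module. The helper names `udp_roundtrip` and `tcp_roundtrip` are invented for this example, and it assumes a small payload that fits in a single read. The handshake itself is performed by the operating system when the client connects; the code only shows where it happens.

```python
import socket


def udp_roundtrip(payload: bytes) -> bytes:
    """Send one datagram to a local UDP socket and read the echoed reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        server.bind(("127.0.0.1", 0))      # port 0 = let the OS pick a free port
        server.settimeout(2.0)
        client.settimeout(2.0)
        # No handshake: the datagram is simply sent to the address.
        client.sendto(payload, server.getsockname())
        data, sender = server.recvfrom(2048)
        server.sendto(data, sender)        # echo it back
        reply, _ = client.recvfrom(2048)
        return reply


def tcp_roundtrip(payload: bytes) -> bytes:
    """Connect to a local TCP listener, send the payload, read the echo."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        listener.settimeout(2.0)
        # create_connection() returns only after SYN, SYN-ACK, ACK has completed.
        with socket.create_connection(listener.getsockname(), timeout=2.0) as client:
            conn, _ = listener.accept()
            with conn:
                conn.settimeout(2.0)
                client.sendall(payload)
                data = conn.recv(2048)     # assumes the payload arrives in one read
                conn.sendall(data)
                return client.recv(2048)


if __name__ == "__main__":
    print(udp_roundtrip(b"ping"))
    print(tcp_roundtrip(b"ping"))
```

On loopback both calls succeed. On a lossy network the UDP version would need its own retry logic, while TCP retransmits for you.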

When would you choose UDP over TCP?

You would choose UDP over TCP when latency matters more than reliability. The key insight is that in real-time applications, a retransmitted packet arriving late is often useless.

Video streaming can skip frames—if a packet is lost, showing the next frame is better than waiting for retransmission. DNS queries are small and can simply retry if the response doesn't arrive. Gaming needs real-time updates where old positional data is worthless by the time it arrives. VoIP works similarly—slightly degraded audio quality is better than choppy delayed audio from retransmissions.

How do IP addressing and subnetting work?

IP addressing and subnetting are fundamental to network design and troubleshooting. IPv4 addresses consist of four octets (0-255 each), like 192.168.1.100. Subnetting allows you to divide networks into smaller, more manageable segments.

Private IP ranges defined in RFC 1918 are not routable on the internet and can be used freely within your network. The 10.0.0.0/8 range provides 16 million addresses for large networks. The 172.16.0.0/12 range offers 1 million addresses for medium networks. The 192.168.0.0/16 range provides 65,536 addresses for home and small networks.

CIDR notation combines the network address with a prefix length indicating how many bits are used for the network portion:

10.0.0.0/8    = 10.0.0.0 - 10.255.255.255   (16,777,216 IPs)
10.0.0.0/16   = 10.0.0.0 - 10.0.255.255     (65,536 IPs)
10.0.0.0/24   = 10.0.0.0 - 10.0.0.255       (256 IPs)
10.0.0.0/32   = 10.0.0.0                     (1 IP - single host)

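Subnet math shortcut: each bit added to the prefix halves the address count, and each bit removed doubles it. A /24 gives 256 addresses (2^8), a /25 gives 128 (2^7), and a /26 gives 64 (2^6). On a traditional subnet, subtract 2 for the network and broadcast addresses to get the usable host count; cloud providers may reserve more (AWS reserves 5 addresses in every subnet).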
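If you want to check the arithmetic rather than do it in your head, Python's standard `ipaddress` module works as a calculator. The sketch below is illustrative only; `subnet_summary` is a made-up helper name, and the "subtract 2" rule it applies is the traditional one, not any cloud provider's reservation policy.

```python
import ipaddress


def subnet_summary(cidr: str) -> dict:
    """Return address counts and the range covered by a CIDR block."""
    net = ipaddress.ip_network(cidr)
    total = net.num_addresses
    # Traditional rule: network and broadcast addresses are not assignable.
    # /31 and /32 blocks are special cases where nothing is subtracted.
    usable = total - 2 if net.prefixlen <= 30 else total
    return {
        "total": total,
        "usable": usable,
        "first": str(net[0]),
        "last": str(net[-1]),
    }


if __name__ == "__main__":
    for cidr in ("10.0.0.0/24", "10.0.0.0/25", "10.0.0.0/26"):
        print(cidr, subnet_summary(cidr))
```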

How would you design IP addressing for a VPC with three subnets?

When designing IP addressing for a cloud VPC, you need to balance having enough addresses for growth while keeping subnets logically organized. A common pattern is to use a /16 for the VPC and /24 for individual subnets.

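This approach gives you 256 possible /24 subnets within your VPC, each with 256 addresses (254 usable hosts in traditional networking; 251 on AWS, which reserves 5 addresses per subnet). That leaves plenty of room to grow while keeping the layout easy to read.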

VPC: 10.0.0.0/16 (65,536 addresses total)

Subnets:
- Public:  10.0.1.0/24  (web servers, load balancers)
- Private: 10.0.2.0/24  (application servers)
- Data:    10.0.3.0/24  (databases)

Each subnet has up to 254 usable IPs (251 on AWS), plenty of room to grow.
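A plan like this is easy to sanity-check in code before you create anything. The sketch below uses the standard `ipaddress` module to confirm that each subnet sits inside the VPC range and that no two subnets overlap. The `plan` dictionary and the `validate_plan` helper are names invented for this example.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")

# Every /24 that fits inside the /16 (there are 256 of them).
all_subnets = list(vpc.subnets(new_prefix=24))

plan = {
    "public": ipaddress.ip_network("10.0.1.0/24"),
    "private": ipaddress.ip_network("10.0.2.0/24"),
    "data": ipaddress.ip_network("10.0.3.0/24"),
}


def validate_plan(vpc_net, subnets: dict) -> bool:
    """Raise ValueError if a subnet falls outside the VPC or overlaps another."""
    nets = list(subnets.values())
    for net in nets:
        if not net.subnet_of(vpc_net):
            raise ValueError(f"{net} is outside {vpc_net}")
    for i, first in enumerate(nets):
        for second in nets[i + 1:]:
            if first.overlaps(second):
                raise ValueError(f"{first} overlaps {second}")
    return True


if __name__ == "__main__":
    print(len(all_subnets), "possible /24 subnets")
    print("plan valid:", validate_plan(vpc, plan))
```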

What ports should every developer know?

Knowing the common ports helps you quickly diagnose connection issues and configure firewalls. These are the default ports registered with IANA or used by convention; any service can be configured to listen elsewhere.

Port	Service	Protocol
22	SSH	TCP
80	HTTP	TCP
443	HTTPS	TCP
53	DNS	UDP/TCP
25	SMTP	TCP
3306	MySQL	TCP
5432	PostgreSQL	TCP
6379	Redis	TCP
27017	MongoDB	TCP

DNS Questions

DNS (Domain Name System) translates human-readable domain names to IP addresses. It's involved in almost every network issue you'll debug.

How does DNS resolution work step by step?

DNS resolution is a hierarchical lookup process that converts domain names to IP addresses. Understanding each step helps you debug DNS issues and optimize DNS performance.

When you request a domain, your system checks multiple caches before making network requests. If no cache has the answer, a recursive lookup begins, traveling from root servers down to authoritative nameservers.

1. Browser cache      → Already know google.com? Use cached IP
2. OS cache           → Check /etc/hosts and system DNS cache
3. Resolver           → Ask configured DNS server (ISP, 8.8.8.8, etc.)
4. Root servers       → "Who handles .com?"
5. TLD servers        → "Who handles google.com?"
6. Authoritative NS   → "google.com is 142.250.x.x"
7. Cache the result   → Store for TTL duration

Recursive vs Iterative resolution: In recursive resolution, the resolver does all the work and returns the final answer. In iterative resolution, each server says "I don't know, ask them" and the client must make subsequent queries.
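The referral chain is easier to remember once you have traced it by hand. The toy model below imitates iterative resolution with in-memory dictionaries instead of real DNS traffic. Everything in `SERVERS` is hypothetical data (the address comes from the 192.0.2.0/24 documentation range), and the `query` and `resolve_iteratively` helpers are invented for this sketch; a real resolver also handles caching, timeouts, and many more record types.

```python
# Hypothetical servers: each one either answers or refers you further down.
SERVERS = {
    "root": {"referrals": {"com.": "tld-com"}, "records": {}},
    "tld-com": {"referrals": {"example.com.": "ns1-example"}, "records": {}},
    "ns1-example": {"referrals": {}, "records": {"www.example.com.": "192.0.2.10"}},
}


def query(server: str, name: str):
    """Return ('answer', ip), ('referral', next_server) or ('nxdomain', None)."""
    zone = SERVERS[server]
    if name in zone["records"]:
        return ("answer", zone["records"][name])
    for suffix, next_server in zone["referrals"].items():
        if name == suffix or name.endswith("." + suffix):
            return ("referral", next_server)   # "I don't know, ask them"
    return ("nxdomain", None)


def resolve_iteratively(name: str):
    """Follow referrals from the root; return (ip or None, servers asked)."""
    server = "root"
    path = [server]
    while True:
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        if kind == "nxdomain":
            return None, path
        server = value
        path.append(server)


if __name__ == "__main__":
    print(resolve_iteratively("www.example.com."))
```

In recursive resolution, your configured resolver runs this loop on your behalf and hands back only the final answer.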

What are the main DNS record types and when do you use each?

DNS records serve different purposes, and choosing the right record type is essential for proper domain configuration. Each record type stores specific information about how to handle requests for a domain.

Type	Purpose	Example
A	IPv4 address	`example.com → 93.184.216.34`
AAAA	IPv6 address	`example.com → 2606:2800:220:1:...`
CNAME	Alias to another name	`www.example.com → example.com`
MX	Mail server (with priority)	`example.com → mail.example.com (10)`
TXT	Arbitrary text	SPF, DKIM, domain verification
NS	Nameserver delegation	`example.com → ns1.example.com`
PTR	Reverse lookup (IP → name)	`34.216.184.93.in-addr.arpa → example.com`
SOA	Start of Authority	Zone metadata, serial numbers

Important CNAME restriction: CNAME records cannot be used at the zone apex (root domain). You can use www.example.com as a CNAME, but example.com itself cannot be a CNAME. Use an A record instead, or a provider-specific ALIAS/ANAME record where your DNS host supports one.

What is TTL and how do you choose the right value?

TTL (Time To Live) determines how long DNS records are cached by resolvers and clients. Choosing the right TTL involves balancing between quick propagation and reduced DNS query load.

Low TTL (60-300 seconds) enables quick propagation when you make changes and easier failover during incidents. However, it results in more DNS queries and slightly higher latency for users.

High TTL (3600-86400 seconds) means fewer DNS queries and better performance since responses are cached longer. The downside is slow propagation when you need to make changes and difficulty switching quickly during emergencies.

Best practice: Lower your TTL before planned changes (like migrations), then raise it back afterward. This gives you flexibility when you need it while maintaining performance during normal operation.
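The caching behavior behind TTL can be sketched in a few lines. The class below is a simplified illustration, not how any particular resolver is implemented; `TtlCache` is a made-up name, and the clock is injectable so the expiry logic can be exercised without waiting.

```python
import time


class TtlCache:
    """Tiny cache whose entries expire after a per-record TTL (in seconds)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}   # name -> (value, expiry timestamp)

    def put(self, name: str, value: str, ttl: float) -> None:
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name: str):
        """Return the cached value, or None if it is missing or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]   # stale: the caller must look it up again
            return None
        return value


if __name__ == "__main__":
    cache = TtlCache()
    cache.put("example.com", "192.0.2.10", ttl=300)
    print(cache.get("example.com"))
```

With a 300-second TTL, a change to the record can take up to five minutes to reach clients that cached the old value, which is why lowering the TTL ahead of a migration helps.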

How do you troubleshoot DNS issues?

DNS troubleshooting requires systematic investigation using command-line tools. These commands help you identify whether DNS is the root cause of connectivity problems.

# Basic lookup
dig example.com
nslookup example.com
 
# Query specific record type
dig example.com MX
dig example.com TXT
 
# Query specific nameserver
dig @8.8.8.8 example.com
 
# Trace the full resolution path
dig +trace example.com
 
# Check TTL remaining
dig example.com | grep -E "^example"
# example.com.    234    IN    A    93.184.216.34
#                 ^^^-- seconds until cache expires
 
# Reverse lookup
dig -x 93.184.216.34

Common DNS failure modes:

NXDOMAIN: Domain doesn't exist (check spelling or registration)
SERVFAIL: The resolver could not get a valid answer (authoritative servers unreachable or misconfigured, or DNSSEC validation failed)
Timeout: Network issue or DNS server down
Wrong IP: Stale cache or misconfiguration

HTTP and HTTPS Questions

Understanding HTTP is essential for debugging web applications and APIs.

What are the HTTP methods and what makes them idempotent or safe?

HTTP methods define the intended action for a request. Understanding idempotence and safety helps you design APIs correctly and debug unexpected behavior.

Idempotent means multiple identical requests have the same effect as a single request. Safe means the request doesn't modify server state. These properties affect how clients can retry requests and how proxies can cache responses.

Method	Purpose	Idempotent	Safe
GET	Retrieve resource	Yes	Yes
POST	Create resource	No	No
PUT	Replace resource	Yes	No
PATCH	Partial update	No	No
DELETE	Remove resource	Yes	No
HEAD	GET without body	Yes	Yes
OPTIONS	Get allowed methods	Yes	Yes
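A quick way to internalize idempotence is to repeat a request and compare the resulting state. The in-memory class below is a hypothetical stand-in for an API backend (`InMemoryApi` and its methods are invented for this sketch); it shows why retrying a PUT or DELETE is harmless while retrying a POST creates a duplicate.

```python
import itertools


class InMemoryApi:
    """Toy resource store that mimics POST, PUT and DELETE semantics."""

    def __init__(self):
        self.items = {}
        self._ids = itertools.count(1)

    def post(self, body: dict) -> int:
        """Create a new resource; every call allocates a fresh ID."""
        new_id = next(self._ids)
        self.items[new_id] = body
        return new_id

    def put(self, item_id: int, body: dict) -> None:
        """Replace the resource at a known ID; repeating it changes nothing."""
        self.items[item_id] = body

    def delete(self, item_id: int) -> None:
        """Remove the resource; deleting it again leaves the same end state."""
        self.items.pop(item_id, None)


if __name__ == "__main__":
    api = InMemoryApi()
    api.put(100, {"name": "a"})
    api.put(100, {"name": "a"})      # retry: still one resource
    api.post({"name": "b"})
    api.post({"name": "b"})          # retry: now two copies of "b"
    print(len(api.items))
```

This is why clients and proxies can retry GET, PUT, and DELETE automatically after a timeout, while POST usually needs an idempotency key or deduplication on the server.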

What HTTP status codes should you know and what do they mean?

HTTP status codes communicate the result of a request. Knowing the common codes helps you quickly understand what's happening when debugging issues.

1xx - Informational (100 Continue, 101 Switching Protocols)
2xx - Success (200 OK, 201 Created, 204 No Content)
3xx - Redirection (301 Moved Permanently, 302 Found, 304 Not Modified)
4xx - Client Error (400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found)
5xx - Server Error (500 Internal Server Error, 502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout)

Key distinctions:

401 vs 403: 401 means not authenticated (who are you?), 403 means authenticated but not authorized (you can't do that)
502 vs 503 vs 504: 502 means the proxy or gateway received an invalid response from the upstream server, 503 means the server is overloaded or down for maintenance, 504 means the upstream server did not respond in time
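If you forget a reason phrase, Python's standard library carries the registry in `http.HTTPStatus`. The small helper below (`describe` is a made-up name) maps a numeric code to its phrase and class, which can be handy in log-parsing scripts.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code: int) -> str:
    """Return e.g. '502 Bad Gateway (Server Error)' for a known status code."""
    status = HTTPStatus(code)            # raises ValueError for unknown codes
    return f"{code} {status.phrase} ({STATUS_CLASSES[code // 100]})"


if __name__ == "__main__":
    for code in (401, 403, 502, 503, 504):
        print(describe(code))
```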

How does the TLS/SSL handshake work?

TLS (Transport Layer Security) encrypts communication between client and server. Understanding the handshake helps you debug certificate issues and performance problems.

The handshake establishes a secure connection by verifying the server's identity and generating shared encryption keys. This all happens before any application data is exchanged.

sequenceDiagram
    participant C as Client
    participant S as Server
    C->>S: Client Hello (cipher suites, TLS version)
    S->>C: Server Hello (chosen cipher, certificate)
    Note over C: Verify certificate chain
    C->>S: Key Exchange (generate session keys)
    S->>C: Key Exchange
    Note over C,S: Encrypted traffic in both directions

Certificate chain verification:

Server certificate (your domain)
Intermediate certificate(s)
Root certificate (trusted by browsers)

Common TLS issues and how to debug them:

Certificate expired: Check the notAfter date
Name mismatch: Certificate doesn't match the domain you're connecting to
Incomplete chain: Missing intermediate certificate
Self-signed: Not trusted by default (need to add to trust store)

# Check certificate
openssl s_client -connect example.com:443 -servername example.com
 
# Check expiration
echo | openssl s_client -connect example.com:443 2>/dev/null | openssl x509 -noout -dates
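The same expiry check can be scripted, for example in a monitoring job. This is a minimal sketch using Python's standard `ssl` and `socket` modules; the function names `fetch_not_after` and `days_until_expiry` are invented here. `fetch_not_after` needs network access and relies on the default trust store, so it raises an `ssl.SSLError` subclass if the chain does not verify.

```python
import socket
import ssl
import time


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the certificate's notAfter string, e.g. 'Jan  5 09:34:43 2018 GMT'."""
    context = ssl.create_default_context()          # verifies chain and hostname
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now=None) -> float:
    """Convert a notAfter string into days remaining (negative if expired)."""
    now = time.time() if now is None else now
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


if __name__ == "__main__":
    # Requires outbound network access.
    expiry = fetch_not_after("example.com")
    print(expiry, f"({days_until_expiry(expiry):.0f} days left)")
```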

What are the differences between HTTP/1.1, HTTP/2, and HTTP/3?

HTTP has evolved significantly to address performance limitations. Each version introduces improvements in how data is transmitted between clients and servers.

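HTTP/1.1 limitations: Each connection handles only one request at a time, so a slow response blocks the requests queued behind it (head-of-line blocking) and browsers open several parallel connections to compensate. Headers are uncompressed text repeated on every request, and there is no server push capability.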

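HTTP/2 improvements: Multiplexing allows multiple concurrent requests over a single connection. Header compression (HPACK) reduces overhead. Server push lets servers send resources proactively, although major browsers have since disabled or removed support for it. The binary protocol is more efficient to parse. Because all streams share one TCP connection, a lost packet still stalls every stream until it is retransmitted.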

HTTP/3 improvements: Uses QUIC protocol built on UDP instead of TCP. Eliminates head-of-line blocking at the transport layer. Offers faster connection establishment. Provides better mobile performance through connection migration when switching networks.

Load Balancing Questions

Load balancers distribute traffic across multiple servers to improve availability and performance.

What is the difference between Layer 4 and Layer 7 load balancing?

Layer 4 and Layer 7 load balancing operate at different levels of the network stack and offer different capabilities. Your choice depends on what information you need to make routing decisions.

Layer 4 (Transport) load balancers route based on IP address and port without inspecting the content. They're faster and less CPU intensive but can only make routing decisions based on network-level information. Use them for TCP/UDP passthrough or non-HTTP protocols.

Layer 7 (Application) load balancers inspect HTTP headers, URLs, and cookies. They can route based on content, perform SSL termination, and modify requests. Use them for HTTP routing, path-based routing, and A/B testing.

flowchart LR
    subgraph L4["Layer 4 Load Balancer"]
        C1["Client"] --> L4LB["L4 LB<br/>(routes by IP:port)"]
        L4LB --> S1["Server"]
    end
 
    subgraph L7["Layer 7 Load Balancer"]
        C2["Client"] --> L7LB["L7 LB"]
        L7LB -->|"/api/*"| API["API Servers"]
        L7LB -->|"/static/*"| CDN["CDN"]
    end

What load balancing algorithms exist and when do you use each?

Load balancing algorithms determine how traffic is distributed across backend servers. The right choice depends on your server capacities and request characteristics.

Algorithm	How It Works	Best For
Round Robin	Rotate through servers sequentially	Equal capacity servers
Weighted Round Robin	Rotate with weights	Mixed capacity servers
Least Connections	Send to server with fewest connections	Varying request duration
IP Hash	Hash client IP to choose server	Session affinity
Least Response Time	Send to fastest responding server	Performance optimization
Random	Random selection	Simple, surprisingly effective

How does the choice of load balancing algorithm affect slow responses?

If users report slow responses and you're using round robin with servers of different capacities, slower servers receive equal traffic and become bottlenecks. The solution is to switch to least connections or weighted round robin.

If requests have varying durations (some quick, some slow), round robin can cause queue buildup on servers that happen to receive multiple slow requests. Least connections mitigates this by sending each new request to the server with the fewest active connections, which is a reasonable proxy for available capacity.
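The difference between the two strategies shows up clearly in a small simulation. The classes below are simplified sketches, not the implementation of any real load balancer; `RoundRobin` and `LeastConnections` are names chosen for this example, and the caller is assumed to report when a request finishes.

```python
import itertools


class RoundRobin:
    """Hand out servers in a fixed rotation, ignoring current load."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self) -> str:
        return next(self._cycle)


class LeastConnections:
    """Send each request to the server with the fewest in-flight requests."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self) -> str:
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        """Call when a request completes so the count stays accurate."""
        self.active[server] -= 1


if __name__ == "__main__":
    lb = LeastConnections(["a", "b", "c"])
    first_three = [lb.pick() for _ in range(3)]   # one request each
    lb.release("b")                               # b finishes early
    print(first_three, "next ->", lb.pick())      # b has the fewest connections
```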

How do health checks work in load balancing?

Health checks allow load balancers to detect unhealthy backends and stop sending traffic to them. Without proper health checks, users may be routed to failed servers.

Health check types:

TCP Check: Can we connect to the port? Fast but basic.
HTTP Check: Does GET /health return 200? Application-aware.
Custom Check: Does /health return {"status": "ok", "db": "connected"}? Deep verification.

Health check parameters:

Interval: How often to check (e.g., 10 seconds)
Timeout: How long to wait for response (e.g., 5 seconds)
Threshold: How many failures before marking unhealthy (e.g., 3)
Recovery: How many successes before marking healthy (e.g., 2)
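The threshold and recovery parameters amount to a small state machine. Here is one way it could be modeled, assuming consecutive results are what count (some load balancers use different rules). `HealthTracker` is a hypothetical name, and the defaults mirror the example values above.

```python
class HealthTracker:
    """Track one backend's health from a stream of check results."""

    def __init__(self, unhealthy_threshold: int = 3, healthy_threshold: int = 2):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.healthy = True
        self._streak = 0   # consecutive results that contradict the current state

    def record(self, check_passed: bool) -> bool:
        """Feed in one check result; return the backend's state afterwards."""
        if check_passed == self.healthy:
            self._streak = 0               # result agrees with state: reset
            return self.healthy
        self._streak += 1
        limit = self.unhealthy_threshold if self.healthy else self.healthy_threshold
        if self._streak >= limit:
            self.healthy = not self.healthy
            self._streak = 0
        return self.healthy


if __name__ == "__main__":
    tracker = HealthTracker()
    for result in (False, False, False, True, True):
        print(result, "->", "healthy" if tracker.record(result) else "unhealthy")
```

Requiring several consecutive results in each direction keeps a single dropped check from removing a healthy server, and keeps a flapping server from rejoining too early.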

What are sticky sessions and what are the trade-offs?

Sticky sessions (session affinity) keep a user connected to the same backend server throughout their session. This simplifies applications that store session state locally but introduces operational challenges.

Methods for implementing sticky sessions:

Cookie-based: Load balancer sets a cookie with server ID
IP-based: Hash client IP (problems with NAT and mobile users)
Application-based: App sets session cookie, load balancer reads it
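As an illustration of the IP-based method, the sketch below hashes the client address to choose a backend. It is a simplified example, not any product's algorithm; `pick_backend` is an invented name. Note that with plain modulo hashing, adding or removing a backend remaps most clients, which is one reason consistent hashing exists.

```python
import hashlib


def pick_backend(client_ip: str, backends: list) -> str:
    """Map a client IP to one backend; the same IP always gets the same server."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


if __name__ == "__main__":
    pool = ["app-1", "app-2", "app-3"]        # hypothetical backend names
    for ip in ("203.0.113.7", "203.0.113.7", "198.51.100.20"):
        print(ip, "->", pick_backend(ip, pool))
```

Every user behind the same NAT gateway shares one source IP, so they all land on the same backend, which is the uneven-load problem mentioned above.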

Trade-offs:

Pros	Cons
Session state stays on one server	Uneven load distribution
Simpler application code	Server failure loses sessions
Better cache hit rates	Harder to scale down

Better alternative: Externalize session state to Redis or a database. This allows any server to handle any request while maintaining session continuity.

Firewall and Security Questions

Understanding firewalls and network security is essential for secure infrastructure design.

How do firewall rules work?

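Firewalls filter network traffic based on rules evaluated in priority order. With ordered rule sets (iptables chains, AWS NACLs), the first matching rule wins and later rules are not consulted. Understanding rule structure and evaluation order is critical for both security and troubleshooting.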

Rules typically specify priority, action (allow/deny), protocol, source address, destination address, and port.

Rule Structure:
[Priority] [Action] [Protocol] [Source] [Destination] [Port]

Example rules:
1. ALLOW  TCP  10.0.0.0/8    any         22     # SSH from internal
2. ALLOW  TCP  any           any         443    # HTTPS from anywhere
3. ALLOW  TCP  any           any         80     # HTTP from anywhere
4. DENY   any  any           any         any    # Default deny
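First-match evaluation of the example rules can be sketched as follows. This is a teaching model, not how any specific firewall is implemented: the `Rule` tuple and `evaluate` function are invented for this example, and destination matching is omitted because every example rule uses `any`.

```python
import ipaddress
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    action: str                 # "ALLOW" or "DENY"
    protocol: Optional[str]     # None means any protocol
    source: str                 # CIDR block; 0.0.0.0/0 means any source
    port: Optional[int]         # None means any port


RULES = [
    Rule("ALLOW", "tcp", "10.0.0.0/8", 22),     # SSH from internal
    Rule("ALLOW", "tcp", "0.0.0.0/0", 443),     # HTTPS from anywhere
    Rule("ALLOW", "tcp", "0.0.0.0/0", 80),      # HTTP from anywhere
    Rule("DENY", None, "0.0.0.0/0", None),      # default deny
]


def evaluate(rules, protocol: str, source_ip: str, port: int) -> str:
    """Return the action of the first rule that matches the packet."""
    address = ipaddress.ip_address(source_ip)
    for rule in rules:
        if rule.protocol not in (None, protocol):
            continue
        if address not in ipaddress.ip_network(rule.source):
            continue
        if rule.port not in (None, port):
            continue
        return rule.action
    return "DENY"   # nothing matched: fail closed


if __name__ == "__main__":
    print(evaluate(RULES, "tcp", "10.1.2.3", 22))       # internal SSH
    print(evaluate(RULES, "tcp", "203.0.113.9", 22))    # external SSH
```

Because evaluation stops at the first match, moving the default-deny rule to the top would block everything, which is a common misconfiguration to look for.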

What is the difference between stateful and stateless firewalls?

Stateful and stateless firewalls differ in how they track connections. This affects both configuration complexity and resource usage.

Stateful	Stateless
Tracks connections	No connection tracking
Return traffic automatic	Need explicit return rules
More memory usage	Less resource intensive
Easier to configure	More rules needed
Security groups (AWS)	NACLs (AWS)

Stateful firewalls remember that an outbound connection was made and automatically allow the return traffic. Stateless firewalls require you to explicitly allow traffic in both directions.
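Connection tracking can be pictured as a set of remembered flows. The toy class below assumes all outbound traffic is allowed and nothing inbound is allowed unless it is a reply; `StatefulFirewall` is a made-up name, and real implementations also track protocol state and expire idle entries.

```python
class StatefulFirewall:
    """Allow inbound packets only when they answer a tracked outbound flow."""

    def __init__(self):
        self._flows = set()   # (src_ip, src_port, dst_ip, dst_port)

    def outbound(self, src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> bool:
        """Record the flow and let the packet out."""
        self._flows.add((src_ip, src_port, dst_ip, dst_port))
        return True

    def inbound(self, src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> bool:
        """Return traffic is the mirror image of a recorded outbound flow."""
        return (dst_ip, dst_port, src_ip, src_port) in self._flows


if __name__ == "__main__":
    fw = StatefulFirewall()
    fw.outbound("10.0.1.50", 40000, "93.184.216.34", 443)
    print(fw.inbound("93.184.216.34", 443, "10.0.1.50", 40000))   # reply
    print(fw.inbound("198.51.100.9", 443, "10.0.1.50", 40000))    # unsolicited
```

A stateless filter has no such table, so you would need an explicit inbound rule covering the ephemeral port range for replies to get through.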

How should you segment networks for security?

Network segmentation divides your infrastructure into security zones, limiting the blast radius of a breach and controlling traffic flow between components.

The principle is defense in depth—multiple layers of security that an attacker must breach. Even if someone compromises your web servers, they shouldn't automatically have access to your databases.

flowchart TB
    Internet["Internet"]
    Internet --> LB["Load Balancer<br/>(Public)"]
 
    subgraph DMZ["DMZ Zone"]
        Web["Web Servers"]
    end
 
    subgraph Private["Private Zone"]
        App["App Servers"]
    end
 
    subgraph Data["Data Zone"]
        DB["Databases"]
    end
 
    LB --> Web
    Web -->|"Firewall"| App
    App -->|"Firewall"| DB

What is NAT and how does it work?

NAT (Network Address Translation) translates private IP addresses to public IP addresses. It allows multiple devices to share a single public IP and provides a layer of security by hiding internal network structure.

Types of NAT:

SNAT (Source NAT): Changes source IP for outbound traffic
DNAT (Destination NAT): Changes destination IP for inbound traffic
PAT (Port Address Translation): Many private IPs share one public IP using different ports

flowchart LR
    Private["Private<br/>10.0.1.50"] -->|"source IP<br/>changed"| NAT["NAT Gateway<br/>203.0.113.5:12345"]
    NAT -->|"translated to<br/>public IP + port"| Internet["example.com"]

NAT Gateway/Instance: Allows private subnets to access the internet without being directly accessible from outside, providing both connectivity and security.
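Port address translation is essentially a lookup table maintained in both directions. The sketch below models that table under simplifying assumptions (sequential port allocation, no timeouts, no protocol distinction); `PatTable` and its methods are hypothetical names.

```python
import itertools


class PatTable:
    """Map (private IP, private port) pairs onto ports of one public IP."""

    def __init__(self, public_ip: str, first_port: int = 1024):
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)
        self._outbound = {}   # (private_ip, private_port) -> public_port
        self._inbound = {}    # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip: str, private_port: int):
        """Rewrite the source of an outgoing packet, reusing existing mappings."""
        key = (private_ip, private_port)
        if key not in self._outbound:
            public_port = next(self._ports)
            self._outbound[key] = public_port
            self._inbound[public_port] = key
        return (self.public_ip, self._outbound[key])

    def translate_inbound(self, public_port: int):
        """Find the private destination for a reply, or None if unsolicited."""
        return self._inbound.get(public_port)


if __name__ == "__main__":
    nat = PatTable("203.0.113.5")
    print(nat.translate_outbound("10.0.1.50", 40000))
    print(nat.translate_outbound("10.0.1.51", 40000))   # same port, other host
    print(nat.translate_inbound(1024))
```

An inbound packet that matches no table entry has nowhere to go, which is why hosts behind NAT are not directly reachable from outside.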

Network Troubleshooting Questions

Systematic debugging is essential for resolving network issues quickly.

What tools do you use for connectivity testing?

Connectivity testing tools help you identify where in the network path a problem occurs. Starting with basic connectivity and working up to application-level tests is the systematic approach.

# Basic connectivity
ping example.com
ping -c 4 example.com          # Stop after 4 pings
 
# Trace route to destination
traceroute example.com         # Linux/Mac
tracert example.com            # Windows
mtr example.com                # Better traceroute (continuous)
 
# Test specific port
telnet example.com 80
nc -zv example.com 80          # Netcat
nc -zv example.com 20-25       # Port range

How do you check what ports are in use on a system?

Knowing what ports are in use and which processes own them is essential for debugging services that won't start or for security auditing.

# Show listening ports
netstat -tulpn                 # Linux
netstat -an | grep LISTEN      # Mac
 
# Modern alternative to netstat
ss -tulpn                      # Show listening ports
ss -s                          # Socket statistics
 
# What's using a port?
lsof -i :80                    # What process has port 80
fuser 80/tcp                   # Alternative

How do you capture and analyze network packets?

Packet capture is the ultimate debugging tool—it shows you exactly what's happening on the wire. Use it when higher-level tools don't reveal the problem.

# Capture packets
tcpdump -i eth0                        # All traffic on interface
tcpdump -i eth0 port 80                # Only port 80
tcpdump -i eth0 host 10.0.1.50         # Only specific host
tcpdump -i eth0 -w capture.pcap        # Save to file
 
# Read capture file
tcpdump -r capture.pcap
wireshark capture.pcap                  # GUI analysis
 
# Useful filters
tcpdump 'tcp[tcpflags] & (tcp-syn) != 0'  # Only SYN packets
tcpdump -A port 80                         # Show ASCII content

What curl commands are essential for HTTP debugging?

curl is your Swiss Army knife for debugging HTTP issues. Knowing these commands helps you quickly isolate whether problems are with DNS, TLS, or the application.

# Basic request
curl https://example.com
 
# Show headers
curl -I https://example.com             # HEAD request (headers only)
curl -i https://example.com             # Include headers in output
 
# Verbose output (see handshake)
curl -v https://example.com
 
# Follow redirects
curl -L https://example.com
 
# Custom headers
curl -H "Authorization: Bearer token" https://api.example.com
 
# POST with data
curl -X POST -d '{"key":"value"}' -H "Content-Type: application/json" https://api.example.com
 
# Time the request
curl -w "@curl-format.txt" -o /dev/null -s https://example.com
 
# curl-format.txt:
#     time_namelookup:  %{time_namelookup}s\n
#        time_connect:  %{time_connect}s\n
#     time_appconnect:  %{time_appconnect}s\n
#        time_total:    %{time_total}s\n

Classic Interview Scenario Questions

These scenarios test your end-to-end understanding of networking concepts.

What happens when you type google.com in a browser?

This classic question tests comprehensive understanding of web request lifecycle. Interviewers want to see that you understand each layer and can explain how they connect.

1. URL Parsing: Browser extracts protocol (https), hostname (google.com), path (/)

2. DNS Resolution:

Check browser cache
Check OS cache
Query DNS resolver
Recursive lookup through root → TLD → authoritative
Cache result based on TTL

3. TCP Connection:

Three-way handshake (SYN, SYN-ACK, ACK)
Connection to IP on port 443

4. TLS Handshake:

Client Hello (supported ciphers)
Server Hello (chosen cipher, certificate)
Certificate verification
Key exchange
Encrypted channel established

5. HTTP Request:

GET / HTTP/2
Headers (Host, User-Agent, Accept, etc.)

6. Server Processing:

Load balancer routes request
Web server processes
Backend calls if needed
Response generated

7. Response:

Status code (200 OK)
Headers (Content-Type, Cache-Control)
Body (HTML)

8. Rendering:

Parse HTML
Fetch CSS, JS, images (parallel requests)
Build DOM and CSSOM
Execute JavaScript
Paint to screen
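Step 1 of the walkthrough, URL parsing, is easy to demonstrate with Python's standard `urllib.parse`. The helper below is only a sketch: `parse_target` is an invented name, and it assumes the scheme is already present (a browser adds one when you type a bare hostname).

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def parse_target(url: str) -> dict:
    """Extract what the later steps need: scheme, host, port and path."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,                       # used for DNS and SNI
        "port": parts.port or DEFAULT_PORTS[parts.scheme],
        "path": parts.path or "/",                    # empty path means "/"
    }


if __name__ == "__main__":
    print(parse_target("https://google.com"))
    print(parse_target("http://example.com:8080/health?verbose=1"))
```

The host feeds DNS resolution in step 2, the port is where the TCP connection in step 3 goes, and the path ends up in the request line in step 5.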

How do you systematically debug connectivity issues?

A systematic approach prevents you from chasing red herrings. Work through the network stack layer by layer until you find where it breaks.

# 1. Can we resolve the hostname?
dig api.example.com
# If NXDOMAIN → DNS issue
 
# 2. Can we reach the IP?
ping 93.184.216.34
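# If timeout → routing/firewall issue, or ICMP is simply blocked (continue to step 3)
 
# 3. Can we reach the port?
nc -zv 93.184.216.34 443
# If refused → nothing is listening (or a firewall is rejecting); if timeout → a firewall is likely dropping packets
 
# 4. Is TLS working?
openssl s_client -connect api.example.com:443 -servername api.example.com
# If handshake fails → certificate, SNI, or protocol mismatch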
 
# 5. Does HTTP work?
curl -v https://api.example.com/health
# If error → application issue
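The first and third steps can also be automated. The sketch below uses Python's standard `socket` module to resolve a name and attempt a TCP connection, returning a label for where the check failed. `diagnose` and its return strings are invented for this example, and it covers only DNS and TCP reachability, not ping, TLS, or HTTP.

```python
import socket


def diagnose(host: str, port: int, timeout: float = 3.0) -> str:
    """Return 'ok' or a label naming the first step that failed."""
    # Step 1: can we resolve the hostname?
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failure"

    # Step 3: can we complete a TCP handshake with the port?
    ip = infos[0][4][0]
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return "ok"
    except ConnectionRefusedError:
        return "refused"        # host reachable, nothing listening (or rejected)
    except socket.timeout:
        return "timeout"        # packets are probably being dropped
    except OSError:
        return "unreachable"    # no route, network down, and similar errors


if __name__ == "__main__":
    print(diagnose("api.example.com", 443))
```

A "refused" result points at the service or its port, while "timeout" points at a firewall or routing problem somewhere on the path.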

How do you design for high availability?

High availability requires eliminating single points of failure at every layer. Interviewers want to see that you understand redundancy, health checking, and graceful degradation.

Key patterns:

Multiple availability zones: Servers in different data centers
Load balancer with health checks: Automatically remove failed instances
DNS failover: Route53 health checks, multiple A records
Connection draining: Graceful shutdown for existing connections
Retry with backoff: Client-side resilience

flowchart TB
    R53["Route53<br/>(DNS with health checks)"]
    ALB["ALB<br/>(Cross-zone load balancing)"]
 
    R53 --> ALB
 
    subgraph AZ1["AZ-1"]
        App1["App"]
    end
 
    subgraph AZ2["AZ-2"]
        App2["App"]
    end
 
    subgraph AZ3["AZ-3"]
        App3["App"]
    end
 
    ALB --> App1
    ALB --> App2
    ALB --> App3
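Of the patterns listed above, retry with backoff is the one that lives in client code. Here is a minimal sketch using exponential backoff with full jitter. The function names and default values are assumptions for illustration, and in practice you should retry only operations that are safe to repeat.

```python
import random
import time


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  rng=random.random) -> float:
    """Random delay between 0 and min(cap, base * 2**attempt) seconds."""
    return rng() * min(cap, base * 2 ** attempt)


def call_with_retry(operation, attempts: int = 5, base: float = 0.5,
                    cap: float = 30.0, sleep=time.sleep):
    """Call operation(); on ConnectionError wait with backoff and try again."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise                      # out of attempts: surface the error
            sleep(backoff_delay(attempt, base, cap))


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky():
        # Hypothetical operation that fails twice before succeeding.
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionError("backend unavailable")
        return "ok"

    print(call_with_retry(flaky, base=0.01))
```

The jitter matters: without it, every client that failed at the same moment retries at the same moment, which can knock over a backend that was just recovering.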

Quick Reference

What are the essential networking commands?

These commands cover the most common debugging scenarios. Memorizing them allows you to quickly diagnose issues.

Task	Command
DNS lookup	`dig example.com`
Trace route	`mtr example.com`
Test port	`nc -zv host port`
Show listening ports	`ss -tulpn`
Capture packets	`tcpdump -i eth0`
HTTP request	`curl -v https://example.com`
Check certificate	`openssl s_client -connect host:443`

What is the quick reference for common ports?

22   SSH         443  HTTPS       6379  Redis
80   HTTP        3306 MySQL       27017 MongoDB
53   DNS         5432 PostgreSQL  9200  Elasticsearch
25   SMTP        8080 Alt HTTP    2379  etcd

What is your troubleshooting checklist?

□ DNS resolving correctly?
□ IP reachable (ping)?
□ Port open (nc/telnet)?
□ Firewall rules allow traffic?
□ Service running on target?
□ TLS certificate valid?
□ Application responding?
□ Correct response code?

Linux Commands Interview Guide - Command-line fundamentals
Docker Interview Guide - Container networking
Kubernetes Interview Guide - Service networking, ingress
AWS Interview Guide - VPCs, security groups, ELB
System Design Interview Guide - Designing distributed systems
