Networking Interview Guide: TCP/IP, DNS, and Load Balancing Fundamentals

Networking knowledge separates good DevOps engineers from great ones. When production goes down at 3 AM, you need to know whether the problem is DNS, a firewall rule, or a failing load balancer—and you need to know fast.

This guide covers the networking fundamentals that come up in DevOps, SRE, and backend interviews. Not certification-level theory, but practical knowledge for debugging real systems.

Networking Fundamentals

The OSI Model (Practical View)

You don't need to memorize all seven layers. Focus on the ones that matter for troubleshooting:

Layer 7 - Application    HTTP, DNS, SSH (what your app speaks)
Layer 4 - Transport      TCP, UDP (how data gets delivered)
Layer 3 - Network        IP, routing (where data goes)
Layer 2 - Data Link      MAC addresses, switches (local network)
Layer 1 - Physical       Cables, signals (hardware)

Why it matters: When debugging, you work up the stack. Can't ping? Probably a Layer 3 issue (or ICMP is being filtered along the path). Can ping but can't connect to the port? Layer 4. Connection works but the app fails? Layer 7.

TCP vs UDP

The most common protocol question in interviews.

TCP (Transmission Control Protocol):

  • Connection-oriented (three-way handshake)
  • Reliable delivery (acknowledgments and retransmission of lost segments)
  • Ordered packets
  • Flow control and congestion control
  • Used for: HTTP, HTTPS, SSH, FTP, SMTP, databases

UDP (User Datagram Protocol):

  • Connectionless (fire and forget)
  • No delivery guarantee
  • No ordering guarantee
  • Lower latency, less overhead
  • Used for: DNS queries, video streaming, gaming, VoIP

TCP Three-Way Handshake:

Client              Server
   |---- SYN ---->|     "I want to connect"
   |<-- SYN-ACK --|     "OK, I acknowledge"
   |---- ACK ---->|     "Great, let's talk"
   |              |
   |-- DATA <---> |     Connection established

Example question: "When would you choose UDP over TCP?"

When latency matters more than reliability. Video streaming can skip frames—a retransmitted packet arriving late is useless. DNS queries are small and can simply retry. Gaming needs real-time updates where old data is worthless.
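
The difference is easy to see with sockets. Below is a minimal sketch, assuming Python's standard `socket` module and the loopback interface; the function names `udp_roundtrip` and `tcp_roundtrip` are made up for illustration. The UDP sender just fires a datagram, while the TCP client has to complete a connection before any data moves.

```python
import socket


def udp_roundtrip(payload: bytes) -> bytes:
    """Send one datagram over loopback: no handshake, no delivery guarantee."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
        receiver.settimeout(2.0)          # a lost datagram would otherwise block forever
        sender.sendto(payload, receiver.getsockname())  # fire and forget
        data, _addr = receiver.recvfrom(4096)
        return data


def tcp_roundtrip(payload: bytes) -> bytes:
    """Send bytes over loopback TCP: connect() performs the three-way handshake."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        # The kernel completes SYN / SYN-ACK / ACK before create_connection returns.
        with socket.create_connection(listener.getsockname(), timeout=2.0) as client:
            conn, _addr = listener.accept()
            with conn:
                conn.settimeout(2.0)
                client.sendall(payload)
                received = b""
                # TCP is a byte stream, so read until the whole payload has arrived.
                while len(received) < len(payload):
                    chunk = conn.recv(4096)
                    if not chunk:
                        break
                    received += chunk
                return received
```

On loopback both calls return the payload unchanged; on a lossy network only the TCP version would retransmit for you.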

IP Addressing and Subnets

IPv4 addresses: Four octets (0-255 each), like 192.168.1.100

Private IP ranges (RFC 1918):

  • 10.0.0.0/8 - 16 million addresses (large networks)
  • 172.16.0.0/12 - 1 million addresses (medium networks)
  • 192.168.0.0/16 - 65,536 addresses (home/small networks)

CIDR notation: Network address + prefix length

10.0.0.0/8    = 10.0.0.0 - 10.255.255.255   (16,777,216 IPs)
10.0.0.0/16   = 10.0.0.0 - 10.0.255.255     (65,536 IPs)
10.0.0.0/24   = 10.0.0.0 - 10.0.0.255       (256 IPs)
10.0.0.0/32   = 10.0.0.0                     (1 IP - single host)

Subnet math shortcut:

  • /24 = 256 addresses (2^8)
  • /25 = 128 addresses (2^7)
  • /26 = 64 addresses (2^6)
  • Subtract 2 for network and broadcast addresses
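
The shortcut can be checked with Python's standard `ipaddress` module. This is a small sketch; `subnet_sizes` is a hypothetical helper name, and the "subtract 2" rule it applies is the classic one (very small prefixes such as /31 and /32 are special cases it does not model).

```python
import ipaddress


def subnet_sizes(cidr: str) -> tuple[int, int]:
    """Return (total addresses, usable hosts) for an IPv4 CIDR block."""
    network = ipaddress.ip_network(cidr)
    total = network.num_addresses      # 2 ** (32 - prefix length)
    usable = max(total - 2, 0)         # minus network and broadcast addresses
    return total, usable


if __name__ == "__main__":
    for cidr in ("10.0.0.0/24", "10.0.0.0/25", "10.0.0.0/26"):
        total, usable = subnet_sizes(cidr)
        print(f"{cidr}: {total} addresses, {usable} usable")
```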

Example question: "Design IP addressing for a VPC with three subnets."

VPC: 10.0.0.0/16 (65,536 addresses total)

Subnets:
- Public:  10.0.1.0/24  (web servers, load balancers)
- Private: 10.0.2.0/24  (application servers)
- Data:    10.0.3.0/24  (databases)

Each /24 subnet has 254 usable IPs by the classic rule (251 in an AWS VPC, where AWS reserves five addresses per subnet), plenty of room to grow.
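
A plan like this is worth sanity-checking before you create anything. The sketch below, assuming the standard `ipaddress` module and a hypothetical `validate_plan` helper, verifies that every subnet sits inside the VPC range and that no two subnets overlap.

```python
import ipaddress
from itertools import combinations


def validate_plan(vpc_cidr: str, subnets: dict[str, str]) -> list[str]:
    """Return a list of problems found in the plan (empty list means it looks fine)."""
    vpc = ipaddress.ip_network(vpc_cidr)
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    problems = []

    # Every subnet must be carved out of the VPC range.
    for name, network in networks.items():
        if not network.subnet_of(vpc):
            problems.append(f"{name} ({network}) is outside {vpc}")

    # No two subnets may share addresses.
    for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{name_a} ({net_a}) overlaps {name_b} ({net_b})")

    return problems


if __name__ == "__main__":
    plan = {"public": "10.0.1.0/24", "private": "10.0.2.0/24", "data": "10.0.3.0/24"}
    print(validate_plan("10.0.0.0/16", plan))   # an empty list for the plan above
```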

Common Ports

Know these by heart:

Port     Service       Protocol
22       SSH           TCP
80       HTTP          TCP
443      HTTPS         TCP
53       DNS           UDP/TCP
25       SMTP          TCP
3306     MySQL         TCP
5432     PostgreSQL    TCP
6379     Redis         TCP
27017    MongoDB       TCP

DNS Deep Dive

DNS translates human-readable names to IP addresses. It's involved in almost every network issue.

How DNS Resolution Works

1. Browser cache      → Already know google.com? Use cached IP
2. OS cache           → Check /etc/hosts and system DNS cache
3. Resolver           → Ask configured DNS server (ISP, 8.8.8.8, etc.)
4. Root servers       → "Who handles .com?"
5. TLD servers        → "Who handles google.com?"
6. Authoritative NS   → "google.com is 142.250.x.x"
7. Cache the result   → Store for TTL duration

Recursive vs Iterative:

  • Recursive: Resolver does all the work, returns final answer
  • Iterative: Each server says "I don't know, ask them"
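
The referral walk can be modeled without touching the network. This is a toy simulation, not a real resolver: the `SERVERS` table, the server names, and the 192.0.2.10 address are all invented for illustration. Each `query` call behaves like one iterative query (answer or referral), and `resolve` plays the recursive resolver that follows referrals on the client's behalf.

```python
# Hypothetical delegation data: each "server" either answers or refers onward.
SERVERS = {
    "root": {"referrals": {"com.": "tld-com"}},
    "tld-com": {"referrals": {"example.com.": "ns-example"}},
    "ns-example": {"answers": {"www.example.com.": "192.0.2.10"}},
}


def query(server: str, name: str):
    """One iterative query: returns ("answer", ip), ("referral", server) or ("nxdomain", None)."""
    data = SERVERS[server]
    if name in data.get("answers", {}):
        return "answer", data["answers"][name]
    for zone, next_server in data.get("referrals", {}).items():
        if name == zone or name.endswith("." + zone):
            return "referral", next_server      # "I don't know, ask them"
    return "nxdomain", None


def resolve(name: str):
    """Recursive behaviour: keep following referrals, return (ip or None, servers asked)."""
    server = "root"
    path = [server]
    while True:
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        if kind == "nxdomain":
            return None, path
        server = value
        path.append(server)
```

Calling `resolve("www.example.com.")` visits the root, the TLD server, and the authoritative server in that order, which is the same path `dig +trace` prints for a real name.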

DNS Record Types

Type     Purpose                        Example
A        IPv4 address                   example.com → 93.184.216.34
AAAA     IPv6 address                   example.com → 2606:2800:220:1:...
CNAME    Alias to another name          www.example.com → example.com
MX       Mail server (with priority)    example.com → mail.example.com (10)
TXT      Arbitrary text                 SPF, DKIM, domain verification
NS       Nameserver delegation          example.com → ns1.example.com
PTR      Reverse lookup (IP → name)     34.216.184.93.in-addr.arpa → example.com
SOA      Start of Authority             Zone metadata, serial numbers

CNAME restrictions:

  • Cannot be used at zone apex (root domain)
  • www.example.com → CNAME OK
  • example.com → CNAME NOT OK (use ALIAS or A record)

TTL (Time To Live)

How long DNS records are cached.

Low TTL (60-300 seconds):
+ Quick propagation for changes
+ Easier failover
- More DNS queries
- Higher latency

High TTL (3600-86400 seconds):
+ Fewer DNS queries
+ Better performance
- Slow propagation
- Harder to change quickly

Best practice: Use low TTL before planned changes, raise it after.
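
Caching by TTL is simple to sketch. The class below is an illustrative, in-memory model of what a resolver cache does; `TTLCache` and its method names are assumptions, not any library's API. The clock is injectable so expiry can be demonstrated without waiting.

```python
import time


class TTLCache:
    """Tiny record cache: entries disappear once their TTL has elapsed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}   # name -> (value, expires_at)

    def put(self, name: str, value: str, ttl: float) -> None:
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name: str):
        entry = self._entries.get(name)
        if entry is None:
            return None                      # never cached: full lookup needed
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]          # expired: next lookup hits DNS again
            return None
        return value
```

With a 300-second TTL, a changed record can keep being served from caches for up to 300 seconds, which is why lowering the TTL ahead of a migration shortens the cutover window.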

DNS Troubleshooting

# Basic lookup
dig example.com
nslookup example.com
 
# Query specific record type
dig example.com MX
dig example.com TXT
 
# Query specific nameserver
dig @8.8.8.8 example.com
 
# Trace the full resolution path
dig +trace example.com
 
# Check TTL remaining
dig example.com | grep -E "^example"
# example.com.    234    IN    A    93.184.216.34
#                 ^^^-- seconds until cache expires
 
# Reverse lookup
dig -x 93.184.216.34

Common DNS issues:

  • NXDOMAIN: Domain doesn't exist
  • SERVFAIL: Nameserver error
  • Timeout: Network issue or server down
  • Wrong IP: Stale cache or misconfiguration

HTTP & HTTPS

HTTP Methods

Method     Purpose                Idempotent    Safe
GET        Retrieve resource      Yes           Yes
POST       Create resource        No            No
PUT        Replace resource       Yes           No
PATCH      Partial update         No            No
DELETE     Remove resource        Yes           No
HEAD       GET without body       Yes           Yes
OPTIONS    Get allowed methods    Yes           Yes

Idempotent: Multiple identical requests have the same effect as one.

Safe: Doesn't modify server state.
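
One practical use of these properties is deciding whether a client may retry a request automatically after a timeout. The sketch below encodes the table as data; the `METHODS` mapping and `can_retry_blindly` helper are illustrative names, not part of any HTTP library.

```python
from typing import NamedTuple


class MethodTraits(NamedTuple):
    idempotent: bool
    safe: bool


# Mirrors the table above.
METHODS = {
    "GET": MethodTraits(idempotent=True, safe=True),
    "POST": MethodTraits(idempotent=False, safe=False),
    "PUT": MethodTraits(idempotent=True, safe=False),
    "PATCH": MethodTraits(idempotent=False, safe=False),
    "DELETE": MethodTraits(idempotent=True, safe=False),
    "HEAD": MethodTraits(idempotent=True, safe=True),
    "OPTIONS": MethodTraits(idempotent=True, safe=True),
}


def can_retry_blindly(method: str) -> bool:
    """A repeated idempotent request leaves the server in the same state as one request."""
    return METHODS[method.upper()].idempotent
```

A timed-out POST may or may not have created the resource, so retrying it without a deduplication mechanism (for example an idempotency key, if the API offers one) risks duplicates.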

HTTP Status Codes

1xx - Informational (100 Continue, 101 Switching Protocols)
2xx - Success (200 OK, 201 Created, 204 No Content)
3xx - Redirection (301 Permanent, 302 Found, 304 Not Modified)
4xx - Client Error (400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found)
5xx - Server Error (500 Internal Error, 502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout)

Know these well:

  • 401 vs 403: 401 = not authenticated, 403 = authenticated but not authorized
  • 502 vs 503 vs 504: 502 = bad response from upstream, 503 = service unavailable (overloaded or down for maintenance), 504 = upstream timeout

TLS/SSL Handshake

Client                          Server
   |                               |
   |------ Client Hello --------->|  Supported cipher suites, TLS version
   |<----- Server Hello ----------|  Chosen cipher, certificate
   |                               |
   |  [Verify certificate chain]   |
   |                               |
   |------ Key Exchange --------->|  Generate session keys
   |<----- Key Exchange ----------|
   |                               |
   |====== Encrypted Traffic =====|  All data now encrypted

Certificate chain:

  1. Server certificate (your domain)
  2. Intermediate certificate(s)
  3. Root certificate (trusted by browsers)

Common TLS issues:

  • Certificate expired: Check notAfter date
  • Name mismatch: Certificate doesn't match domain
  • Incomplete chain: Missing intermediate certificate
  • Self-signed: Not trusted by default

# Check certificate
openssl s_client -connect example.com:443 -servername example.com
 
# Check expiration
echo | openssl s_client -connect example.com:443 2>/dev/null | openssl x509 -noout -dates
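
To turn the `notAfter` line into an alert, you can parse it with Python's standard `ssl.cert_time_to_seconds`. This is a sketch: `days_until_expiry` is a hypothetical helper, and it assumes the input looks like the `notAfter=...` line printed by `openssl x509 -noout -dates`.

```python
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate expires (negative if already expired).

    Accepts either the bare date ("Jan  5 09:34:43 2018 GMT") or the full
    "notAfter=..." line from openssl.
    """
    date_text = not_after.strip().removeprefix("notAfter=")
    expires_at = ssl.cert_time_to_seconds(date_text)   # seconds since the epoch
    if now is None:
        now = time.time()
    return (expires_at - now) / 86400
```

A monitoring job might warn when the result drops below some threshold such as 30 days; the threshold is a policy choice, not something the certificate dictates.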

HTTP/2 and HTTP/3

HTTP/1.1 limitations:

  • One outstanding request at a time per connection (head-of-line blocking)
  • Text-based headers (inefficient)
  • No server push

HTTP/2 improvements:

  • Multiplexing (multiple requests over one connection)
  • Header compression (HPACK)
  • Server push (rarely used in practice; major browsers have since dropped support)
  • Binary protocol

HTTP/3 improvements:

  • QUIC protocol (UDP-based)
  • No head-of-line blocking at transport layer
  • Faster connection establishment
  • Better mobile performance (connection migration)

Load Balancing

Layer 4 vs Layer 7

Layer 4 (Transport):

  • Routes based on IP address and port
  • No content inspection
  • Faster, less CPU intensive
  • Use for: TCP/UDP passthrough, non-HTTP protocols

Layer 7 (Application):

  • Inspects HTTP headers, URLs, cookies
  • Can route based on content
  • SSL termination
  • Use for: HTTP routing, path-based routing, A/B testing

Layer 4 Load Balancer:
Client → [L4 LB] → Server
         (routes by IP:port)

Layer 7 Load Balancer:
Client → [L7 LB] → Server
         (routes by /api/* → api-servers)
         (routes by /static/* → cdn)

Load Balancing Algorithms

Algorithm               How It Works                               Best For
Round Robin             Rotate through servers sequentially        Equal capacity servers
Weighted Round Robin    Rotate with weights                        Mixed capacity servers
Least Connections       Send to server with fewest connections     Varying request duration
IP Hash                 Hash client IP to choose server            Session affinity
Least Response Time     Send to fastest responding server          Performance optimization
Random                  Random selection                           Simple, surprisingly effective

Example question: "Your users report slow responses. How might the load balancing algorithm affect this?"

If using round robin with servers of different capacities, slower servers get equal traffic and become bottlenecks. Switch to least connections or weighted round robin. If requests have varying durations, least connections prevents queue buildup on slow servers.
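
Three of these algorithms fit in a few lines each. The classes below are simplified sketches with invented names; real load balancers track connections and weights with far more care (for example, interleaving weighted picks rather than sending them in bursts).

```python
import itertools


class RoundRobin:
    """Rotate through servers in order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self) -> str:
        return next(self._cycle)


class WeightedRoundRobin:
    """Naive weighting: a server with weight 2 appears twice in the rotation."""

    def __init__(self, weights: dict[str, int]):
        expanded = [name for name, weight in weights.items() for _ in range(weight)]
        self._cycle = itertools.cycle(expanded)

    def pick(self) -> str:
        return next(self._cycle)


class LeastConnections:
    """Send each new request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {name: 0 for name in servers}

    def pick(self) -> str:
        server = min(self.active, key=self.active.get)   # ties go to the first listed
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        self.active[server] -= 1                         # call when a request finishes
```

Least connections adapts to slow requests because a server stuck on long-running work keeps a high active count and stops receiving new traffic until it catches up.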

Health Checks

Load balancers need to know which backends are healthy.

Health Check Types:

TCP Check:
- Can we connect to port 80?
- Fast, basic

HTTP Check:
- Does GET /health return 200?
- Application-aware

Custom Check:
- Does /health return {"status": "ok", "db": "connected"}?
- Deep health verification

Health check parameters:

  • Interval: How often to check (e.g., 10 seconds)
  • Timeout: How long to wait for response (e.g., 5 seconds)
  • Threshold: How many failures before marking unhealthy (e.g., 3)
  • Recovery: How many successes before marking healthy (e.g., 2)
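
The threshold and recovery parameters amount to a small state machine. The sketch below is one plausible way to model it, counting consecutive results; `HealthTracker` is a hypothetical name and the defaults simply reuse the example values above.

```python
class HealthTracker:
    """Flip a backend's state only after enough consecutive contrary results."""

    def __init__(self, unhealthy_threshold: int = 3, healthy_threshold: int = 2):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.healthy = True
        self._streak = 0   # consecutive checks that contradict the current state

    def record(self, check_passed: bool) -> bool:
        """Feed in one health check result; returns the (possibly updated) state."""
        if check_passed == self.healthy:
            self._streak = 0                  # result agrees with state: reset
            return self.healthy
        self._streak += 1
        needed = self.unhealthy_threshold if self.healthy else self.healthy_threshold
        if self._streak >= needed:
            self.healthy = not self.healthy
            self._streak = 0
        return self.healthy
```

Requiring consecutive failures keeps a single dropped probe from pulling a healthy backend out of rotation; with a 10-second interval and a threshold of 3, detection takes roughly 30 seconds.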

Sticky Sessions

Keep a user connected to the same backend server.

Methods:

  • Cookie-based: Load balancer sets a cookie with server ID
  • IP-based: Hash client IP (problems with NAT)
  • Application-based: App sets session cookie, LB reads it

Trade-offs:

Pros:
+ Session state stays on one server
+ Simpler application code
+ Better cache hit rates

Cons:
- Uneven load distribution
- Server failure loses sessions
- Harder to scale down

Better alternative: Externalize session state to Redis or database.
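
For reference, IP-based stickiness can be as small as a hash and a modulo. This is an illustrative sketch (`pick_backend` is an invented name, and SHA-256 is just one stable hash choice), and it also shows why the approach is fragile.

```python
import hashlib


def pick_backend(client_ip: str, backends: list[str]) -> str:
    """Map a client IP to a backend deterministically."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


if __name__ == "__main__":
    backends = ["app-1", "app-2", "app-3"]
    print(pick_backend("203.0.113.5", backends))
    print(pick_backend("203.0.113.5", backends))   # same client, same backend
```

Every client behind the same NAT shares one public IP and therefore lands on the same backend, and changing the number of backends changes the modulo, which can remap many clients at once.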


Firewalls & Security

Firewall Rules

Firewalls filter traffic based on rules evaluated in order.

Rule Structure:
[Priority] [Action] [Protocol] [Source] [Destination] [Port]

Example rules:
1. ALLOW  TCP  10.0.0.0/8    any         22     # SSH from internal
2. ALLOW  TCP  any           any         443    # HTTPS from anywhere
3. ALLOW  TCP  any           any         80     # HTTP from anywhere
4. DENY   any  any           any         any    # Default deny
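
First-match evaluation is straightforward to model. The sketch below encodes the four example rules and walks them in order; the `Rule` type and `evaluate` function are illustrative, and the destination column is left out because every example rule uses "any" there.

```python
import ipaddress
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    action: str            # "ALLOW" or "DENY"
    protocol: str          # "tcp", "udp", or "any"
    source: str            # CIDR block or "any"
    port: Optional[int]    # None means any port


RULES = [
    Rule("ALLOW", "tcp", "10.0.0.0/8", 22),   # SSH from internal
    Rule("ALLOW", "tcp", "any", 443),         # HTTPS from anywhere
    Rule("ALLOW", "tcp", "any", 80),          # HTTP from anywhere
    Rule("DENY", "any", "any", None),         # default deny
]


def evaluate(rules, protocol: str, source_ip: str, port: int) -> str:
    """Return the action of the first rule that matches the packet."""
    ip = ipaddress.ip_address(source_ip)
    for rule in rules:
        if rule.protocol not in ("any", protocol):
            continue
        if rule.source != "any" and ip not in ipaddress.ip_network(rule.source):
            continue
        if rule.port is not None and rule.port != port:
            continue
        return rule.action        # first match wins; later rules are never consulted
    return "DENY"                 # nothing matched: fail closed
```

Because the first match wins, order matters: moving the default deny to the top would block everything, including the SSH and HTTPS traffic the later rules are meant to allow.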

Stateful vs Stateless:

Stateful                    Stateless
Tracks connections          No connection tracking
Return traffic automatic    Need explicit return rules
More memory usage           Less resource intensive
Easier to configure         More rules needed
Security groups (AWS)       NACLs (AWS)

Network Segmentation

Divide networks into security zones:

┌─────────────────────────────────────────────┐
│                  Internet                    │
└──────────────────────┬──────────────────────┘
                       │
              ┌────────▼────────┐
              │  Load Balancer  │
              │   (Public)      │
              └────────┬────────┘
                       │
┌──────────────────────┼──────────────────────┐
│ DMZ                  │                       │
│              ┌───────▼───────┐              │
│              │  Web Servers  │              │
│              └───────┬───────┘              │
└──────────────────────┼──────────────────────┘
                       │ (Firewall)
┌──────────────────────┼──────────────────────┐
│ Private              │                       │
│              ┌───────▼───────┐              │
│              │  App Servers  │              │
│              └───────┬───────┘              │
└──────────────────────┼──────────────────────┘
                       │ (Firewall)
┌──────────────────────┼──────────────────────┐
│ Data                 │                       │
│              ┌───────▼───────┐              │
│              │   Databases   │              │
│              └───────────────┘              │
└─────────────────────────────────────────────┘

NAT (Network Address Translation)

Translates private IPs to public IPs.

Types:

  • SNAT (Source NAT): Change source IP (outbound traffic)
  • DNAT (Destination NAT): Change destination IP (inbound traffic)
  • PAT (Port Address Translation): Many private IPs share one public IP

NAT Example (outbound):

Private                NAT Gateway              Internet
10.0.1.50 ──────────> 203.0.113.5:12345 ──────────> example.com
         source IP     translated to
         changed       public IP + port

NAT Gateway/Instance: Allows private subnets to access internet without being directly accessible.
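
The bookkeeping behind PAT is a pair of lookup tables. The following is a toy model, assuming a hypothetical `PortAddressTranslator` class and sequential port allocation starting at the port number used in the diagram; real NAT devices also track protocol, destination, and timeouts.

```python
class PortAddressTranslator:
    """Many private (ip, port) pairs share one public IP, told apart by public port."""

    def __init__(self, public_ip: str, first_port: int = 12345):
        self.public_ip = public_ip
        self._next_port = first_port
        self._outbound = {}   # (private_ip, private_port) -> public_port
        self._inbound = {}    # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip: str, private_port: int):
        """Rewrite the source of an outgoing packet, allocating a port if needed."""
        key = (private_ip, private_port)
        if key not in self._outbound:
            self._outbound[key] = self._next_port
            self._inbound[self._next_port] = key
            self._next_port += 1
        return self.public_ip, self._outbound[key]

    def translate_inbound(self, public_port: int):
        """Find the private host for a reply, or None if no mapping exists."""
        return self._inbound.get(public_port)
```

Replies are only deliverable when a mapping already exists, which is why hosts behind a NAT gateway can reach the internet but cannot be reached by unsolicited inbound connections.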


Troubleshooting Tools

Connectivity Testing

# Basic connectivity
ping example.com
ping -c 4 example.com          # Stop after 4 pings
 
# Trace route to destination
traceroute example.com         # Linux/Mac
tracert example.com            # Windows
mtr example.com                # Better traceroute (continuous)
 
# Test specific port
telnet example.com 80
nc -zv example.com 80          # Netcat
nc -zv example.com 20-25       # Port range

Network Statistics

# Show listening ports
netstat -tulpn                 # Linux
netstat -an | grep LISTEN      # Mac
 
# Modern alternative to netstat
ss -tulpn                      # Show listening ports
ss -s                          # Socket statistics
 
# What's using a port?
lsof -i :80                    # What process has port 80
fuser 80/tcp                   # Alternative

Packet Analysis

# Capture packets
tcpdump -i eth0                        # All traffic on interface
tcpdump -i eth0 port 80                # Only port 80
tcpdump -i eth0 host 10.0.1.50         # Only specific host
tcpdump -i eth0 -w capture.pcap        # Save to file
 
# Read capture file
tcpdump -r capture.pcap
wireshark capture.pcap                  # GUI analysis
 
# Useful filters
tcpdump 'tcp[tcpflags] & (tcp-syn) != 0'  # Only SYN packets
tcpdump -A port 80                         # Show ASCII content

HTTP Debugging with curl

# Basic request
curl https://example.com
 
# Show headers
curl -I https://example.com             # HEAD request (headers only)
curl -i https://example.com             # Include headers in output
 
# Verbose output (see handshake)
curl -v https://example.com
 
# Follow redirects
curl -L https://example.com
 
# Custom headers
curl -H "Authorization: Bearer token" https://api.example.com
 
# POST with data
curl -X POST -d '{"key":"value"}' -H "Content-Type: application/json" https://api.example.com
 
# Time the request
curl -w "@curl-format.txt" -o /dev/null -s https://example.com
 
# curl-format.txt:
#     time_namelookup:  %{time_namelookup}s\n
#        time_connect:  %{time_connect}s\n
#     time_appconnect:  %{time_appconnect}s\n
#        time_total:    %{time_total}s\n

Common Interview Scenarios

"What happens when you type google.com in a browser?"

This classic question tests end-to-end understanding:

  1. URL Parsing: Browser extracts protocol (https), hostname (google.com), path (/)

  2. DNS Resolution:

    • Check browser cache
    • Check OS cache
    • Query DNS resolver
    • Recursive lookup through root → TLD → authoritative
    • Cache result based on TTL

  3. TCP Connection:

    • Three-way handshake (SYN, SYN-ACK, ACK)
    • Connection to IP on port 443

  4. TLS Handshake:

    • Client Hello (supported ciphers)
    • Server Hello (chosen cipher, certificate)
    • Certificate verification
    • Key exchange
    • Encrypted channel established

  5. HTTP Request:

    • GET / HTTP/2
    • Headers (Host, User-Agent, Accept, etc.)

  6. Server Processing:

    • Load balancer routes request
    • Web server processes
    • Backend calls if needed
    • Response generated

  7. Response:

    • Status code (200 OK)
    • Headers (Content-Type, Cache-Control)
    • Body (HTML)

  8. Rendering:

    • Parse HTML
    • Fetch CSS, JS, images (parallel requests)
    • Build DOM and CSSOM
    • Execute JavaScript
    • Paint to screen

Debugging Connectivity Issues

Systematic approach:

# 1. Can we resolve the hostname?
dig api.example.com
# If NXDOMAIN → DNS issue
 
# 2. Can we reach the IP?
ping 93.184.216.34
# If timeout → routing/firewall issue
 
# 3. Can we reach the port?
nc -zv 93.184.216.34 443
# If refused → service not running or firewall
 
# 4. Is TLS working?
openssl s_client -connect api.example.com:443
# If handshake fails → certificate issue
 
# 5. Does HTTP work?
curl -v https://api.example.com/health
# If error → application issue
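
The same bottom-up order can be scripted. Below is a partial sketch using only Python's `socket` module; the function names are invented, and it covers just the DNS and TCP steps (ping needs raw sockets or an external command, and the TLS and HTTP steps are left to the tools above).

```python
import socket
from typing import Optional


def resolves(host: str) -> bool:
    """Step 1: can the hostname be resolved (or is it already an IP)?"""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Step 3: does a TCP connection to host:port succeed?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:          # refused, timed out, unreachable, ...
        return False


def first_failure(host: str, port: int) -> Optional[str]:
    """Run the checks in order and name the first one that fails."""
    if not resolves(host):
        return "dns"
    if not port_open(host, port):
        return "tcp"
    return None              # both checks passed; look higher up the stack
```

Stopping at the first failure matters: a TLS error is meaningless if the name never resolved, so each step is only worth interpreting once the ones before it pass.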

Designing for High Availability

Key patterns:

  1. Multiple availability zones: Servers in different data centers
  2. Load balancer with health checks: Automatically remove failed instances
  3. DNS failover: Route53 health checks, multiple A records
  4. Connection draining: Graceful shutdown for existing connections
  5. Retry with backoff: Client-side resilience

High Availability Architecture:

       ┌─────────────┐
       │   Route53   │ (DNS with health checks)
       └──────┬──────┘
              │
       ┌──────▼──────┐
       │     ALB     │ (Cross-zone load balancing)
       └──────┬──────┘
              │
    ┌─────────┼─────────┐
    │         │         │
┌───▼───┐ ┌───▼───┐ ┌───▼───┐
│ AZ-1  │ │ AZ-2  │ │ AZ-3  │
│ ┌───┐ │ │ ┌───┐ │ │ ┌───┐ │
│ │App│ │ │ │App│ │ │ │App│ │
│ └───┘ │ │ └───┘ │ │ └───┘ │
└───────┘ └───────┘ └───────┘
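
Of the patterns above, retry with backoff is the one that lives in client code. Here is a minimal sketch of exponential backoff with full jitter; `retry_with_backoff` and its parameters are illustrative choices, and production code would normally retry only on errors known to be transient rather than on every exception.

```python
import random
import time


def retry_with_backoff(operation, attempts=5, base_delay=0.5, max_delay=8.0,
                       sleep=time.sleep, rng=random.random):
    """Call operation() until it succeeds or the attempts run out.

    The wait ceiling doubles after each failure (0.5s, 1s, 2s, ...) up to
    max_delay, and the actual wait is a random fraction of that ceiling.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise                          # out of attempts: surface the error
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(rng() * ceiling)             # jitter spreads out retrying clients
```

The jitter is the important part during an outage: without it, every client that failed at the same moment retries at the same moment, and the recovering service gets hit by synchronized waves of traffic.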

Quick Reference

Essential Commands

Task                    Command
DNS lookup              dig example.com
Trace route             mtr example.com
Test port               nc -zv host port
Show listening ports    ss -tulpn
Capture packets         tcpdump -i eth0
HTTP request            curl -v https://example.com
Check certificate       openssl s_client -connect host:443

Common Ports Quick Reference

22   SSH         443  HTTPS       6379  Redis
80   HTTP        3306 MySQL       27017 MongoDB
53   DNS         5432 PostgreSQL  9200  Elasticsearch
25   SMTP        8080 Alt HTTP    2379  etcd

Troubleshooting Checklist

□ DNS resolving correctly?
□ IP reachable (ping)?
□ Port open (nc/telnet)?
□ Firewall rules allow traffic?
□ Service running on target?
□ TLS certificate valid?
□ Application responding?
□ Correct response code?

Final Thoughts

Networking interviews test practical knowledge—not memorized theory. Interviewers want to see:

  1. Systematic debugging: Work through layers methodically
  2. Tool familiarity: Know dig, curl, tcpdump, netstat
  3. Protocol understanding: TCP vs UDP, HTTP status codes, DNS records
  4. Security awareness: Firewalls, encryption, segmentation

The best preparation is hands-on practice. Break things, debug them, understand why they failed. That experience shows in interviews.
