Microservices architecture is one of those topics where the interview reveals whether you've actually built distributed systems or just read about them. Anyone can recite the definition - "independently deployable services organized around business capabilities." The real questions probe deeper: How do you handle a transaction spanning three services? What happens when the payment service is slow? How do you debug a request that touches twelve services?
This guide covers microservices at the depth interviewers expect from senior developers. Not just patterns and definitions, but the trade-offs, failure modes, and practical decisions that matter in production.
Table of Contents
- Microservices Fundamentals Questions
- Monolith vs Microservices Questions
- Service Communication Questions
- API Gateway Questions
- Service Discovery Questions
- Circuit Breaker Questions
- Resilience Pattern Questions
- Database Per Service Questions
- Saga Pattern Questions
- CQRS Questions
- Spring Cloud Questions
- Deployment and Observability Questions
- Microservices Testing Questions
Microservices Fundamentals Questions
Understanding the core principles of microservices helps you make better architectural decisions.
What are microservices and what problems do they solve?
Microservices are an architectural style where an application is built as a collection of small, autonomous services, each running in its own process and communicating through lightweight mechanisms like HTTP or messaging. Each service is independently deployable, scalable, and owned by a small team.
The key problems microservices solve include organizational scaling (multiple teams can work independently), technical scaling (different components can scale separately), and technology flexibility (each service can use the best tool for its job). However, they introduce distributed system complexity that must be carefully managed.
Key characteristics:
| Characteristic | Description |
|---|---|
| Single responsibility | Each service does one thing well |
| Independent deployment | Deploy without coordinating with other services |
| Decentralized data | Each service owns its data store |
| Smart endpoints, dumb pipes | Business logic in services, not middleware |
| Design for failure | Assume network is unreliable |
| Evolutionary design | Services can be rewritten/replaced |
What are bounded contexts and how do they relate to microservices?
Bounded contexts come from Domain-Driven Design (DDD) and represent areas where a particular domain model applies consistently. Within a bounded context, terms have specific meanings that may differ from other contexts. For example, "Product" in a Catalog context means something different than "Product" in a Shipping context.
Services should align with bounded contexts because this creates natural service boundaries. When you decompose by bounded context, each service has a coherent domain model, clear ownership, and minimal need to coordinate with other services for its core functionality.
flowchart TB
subgraph Domain["E-Commerce Domain"]
direction TB
subgraph Catalog["Catalog Context"]
C1["Product"]
C2["Category"]
C3["Price"]
end
subgraph Orders["Orders Context"]
O1["Order"]
O2["LineItem"]
O3["Customer"]
end
subgraph Shipping["Shipping Context"]
S1["Shipment"]
S2["Tracking"]
S3["Address"]
end
end
Note["'Product' means different things in each context"]
style Domain fill:#1e1b4b,stroke:#a855f7,stroke-width:2px
style Catalog fill:#6366f1,stroke:#a855f7
style Orders fill:#6366f1,stroke:#a855f7
style Shipping fill:#6366f1,stroke:#a855f7
style Note fill:#374151,stroke:#a855f7,stroke-width:1px
Monolith vs Microservices Questions
Choosing between monolith and microservices is one of the most important architectural decisions.
What is the difference between monolithic and microservices architecture?
A monolithic architecture packages all functionality into a single deployable unit with a shared database, while microservices split functionality into independent services, each with its own database. In a monolith, all components scale together and share the same technology stack. In microservices, components scale independently and can use different technologies.
The key trade-off is simplicity versus flexibility. Monoliths are simpler to develop, test, and deploy initially, but become harder to maintain as they grow. Microservices offer more flexibility and scalability but introduce distributed system complexity from day one.
Monolithic architecture:
flowchart TB
subgraph Monolith["Monolith Application"]
direction TB
Users["Users"]
Orders["Orders"]
Payments["Payments"]
DB[("Shared Database")]
Users & Orders & Payments --> DB
end
style Monolith fill:#1e1b4b,stroke:#a855f7,stroke-width:2px
style Users fill:#6366f1,stroke:#a855f7
style Orders fill:#6366f1,stroke:#a855f7
style Payments fill:#6366f1,stroke:#a855f7
style DB fill:#7c3aed,stroke:#a855f7
Microservices architecture:
flowchart TB
subgraph Services["Microservices"]
direction LR
US["Users Service"]
OS["Orders Service"]
PS["Payments Service"]
US <--> OS <--> PS
end
subgraph Databases["Independent Databases"]
direction LR
UDB[("Users DB")]
ODB[("Orders DB")]
PDB[("Payments DB")]
end
US --> UDB
OS --> ODB
PS --> PDB
style Services fill:#1e1b4b,stroke:#a855f7,stroke-width:2px
style Databases fill:#242833,stroke:#a855f7,stroke-width:2px
style US fill:#6366f1,stroke:#a855f7
style OS fill:#6366f1,stroke:#a855f7
style PS fill:#6366f1,stroke:#a855f7
style UDB fill:#7c3aed,stroke:#a855f7
style ODB fill:#7c3aed,stroke:#a855f7
style PDB fill:#7c3aed,stroke:#a855f7
When should you choose microservices over a monolith?
The decision to adopt microservices should be driven by organizational and technical needs, not by following trends. Microservices make sense when the benefits of independent deployment and scaling outweigh the complexity costs of distributed systems.
The worst mistake is premature decomposition. Start with a well-structured monolith, then extract services when specific pain points emerge. Martin Fowler calls this "MonolithFirst" - you can always decompose later, but it's very hard to merge poorly designed microservices back together.
Choose microservices when:
- Multiple teams need to deploy independently
- Different components have vastly different scaling needs
- You need technology diversity (Python for ML, Java for transactions)
- The domain is complex enough to warrant bounded contexts
- You have DevOps maturity (CI/CD, monitoring, container orchestration)
Stick with monolith when:
- Small team (< 10 developers)
- Simple domain
- Unclear service boundaries
- Limited DevOps capability
- Startup exploring product-market fit
Service Communication Questions
How services communicate is fundamental to microservices design.
What are the different ways microservices can communicate?
Microservices communicate through two main patterns: synchronous and asynchronous. Synchronous communication (REST, gRPC) requires both services to be available simultaneously and adds to request latency. Asynchronous communication (message queues, events) decouples services temporally - the producer doesn't wait for the consumer.
The choice depends on your consistency and latency requirements. Use synchronous for operations needing immediate response (checking inventory before confirming order). Use asynchronous for operations that can be processed later (sending confirmation email after order placed).
| Aspect | Synchronous | Asynchronous |
|---|---|---|
| Coupling | Temporal coupling (both must be available) | Decoupled (producer doesn't wait) |
| Latency | Adds to request latency | Non-blocking |
| Consistency | Immediate | Eventual |
| Failure handling | Immediate feedback | Requires dead letter queues |
| Debugging | Easier to trace | Requires distributed tracing |
How do you implement REST communication between microservices in Spring?
Spring provides several options for REST communication: RestTemplate (legacy but still widely used), WebClient (reactive and preferred for new code), and OpenFeign (declarative interface-based client). Each has trade-offs between simplicity, performance, and features.
WebClient is the modern choice because it supports both blocking and non-blocking calls, handles backpressure, and integrates well with reactive pipelines. OpenFeign is excellent when you want clean, interface-based clients that look like local method calls.
// Using RestTemplate (legacy)
@Service
public class OrderService {
private final RestTemplate restTemplate;
public UserDTO getUser(Long userId) {
return restTemplate.getForObject(
"http://user-service/api/users/{id}",
UserDTO.class,
userId
);
}
}
// Using WebClient (reactive, preferred)
@Service
public class OrderService {
private final WebClient webClient;
public Mono<UserDTO> getUser(Long userId) {
return webClient.get()
.uri("http://user-service/api/users/{id}", userId)
.retrieve()
.bodyToMono(UserDTO.class);
}
}
// Using OpenFeign (declarative)
@FeignClient(name = "user-service")
public interface UserClient {
@GetMapping("/api/users/{id}")
UserDTO getUser(@PathVariable Long id);
@PostMapping("/api/users")
UserDTO createUser(@RequestBody CreateUserRequest request);
}
When should you use gRPC instead of REST?
gRPC is a high-performance RPC framework that uses Protocol Buffers for serialization and HTTP/2 for transport. It's significantly faster than REST due to binary serialization and multiplexed connections. Use gRPC for internal service-to-service communication where performance matters.
The trade-off is complexity and tooling. REST is universally understood, easy to debug with curl, and works everywhere. gRPC requires code generation, special tooling for debugging, and doesn't work directly in browsers. Most teams use REST for external APIs and gRPC for internal high-throughput communication.
// user.proto
syntax = "proto3";
service UserService {
rpc GetUser (GetUserRequest) returns (User);
rpc CreateUser (CreateUserRequest) returns (User);
}
message GetUserRequest {
int64 id = 1;
}
message User {
int64 id = 1;
string email = 2;
string name = 3;
}
// gRPC client
@Service
public class OrderService {
private final UserServiceGrpc.UserServiceBlockingStub userStub;
public User getUser(long userId) {
GetUserRequest request = GetUserRequest.newBuilder()
.setId(userId)
.build();
return userStub.getUser(request);
}
}
| Protocol | Use When |
|---|---|
| REST | External APIs, simple CRUD, wide compatibility |
| gRPC | Internal services, high throughput, streaming |
| GraphQL | Client needs flexible queries, multiple frontends |
How do you implement asynchronous communication with message queues?
Asynchronous communication uses message brokers (Kafka, RabbitMQ) to decouple services. The producer publishes events without waiting for consumers. This enables loose coupling, better scalability, and resilience - if a consumer is down, messages queue up and are processed when it recovers.
Spring Cloud Stream provides an abstraction over message brokers, making it easy to switch between Kafka and RabbitMQ. Events should be immutable facts about what happened (OrderCreated), not commands (CreateOrder). This allows multiple consumers to react independently.
flowchart LR
OS["Order Service"] --> MQ["Message Queue"]
MQ --> IS["Inventory Service"]
MQ --> SS["Shipping Service"]
style OS fill:#6366f1,stroke:#a855f7,stroke-width:2px
style MQ fill:#7c3aed,stroke:#a855f7,stroke-width:2px
style IS fill:#6366f1,stroke:#a855f7,stroke-width:2px
style SS fill:#6366f1,stroke:#a855f7,stroke-width:2px
// Publishing events (Spring Cloud Stream / Kafka)
@Service
public class OrderService {
private final StreamBridge streamBridge;
@Transactional
public Order createOrder(CreateOrderRequest request) {
Order order = orderRepository.save(new Order(request));
// Publish event after successful save
OrderCreatedEvent event = new OrderCreatedEvent(
order.getId(),
order.getUserId(),
order.getItems()
);
streamBridge.send("orders-out", event);
return order;
}
}
// Consuming events
@Component
public class InventoryEventHandler {
@KafkaListener(topics = "orders")
public void handleOrderCreated(OrderCreatedEvent event) {
for (OrderItem item : event.getItems()) {
inventoryService.reserve(item.getProductId(), item.getQuantity());
}
}
}
API Gateway Questions
The API Gateway is the single entry point for external clients.
What is an API Gateway and what are its responsibilities?
An API Gateway is a server that acts as the single entry point for all client requests. Instead of clients calling individual microservices directly, they call the gateway which routes requests to appropriate services. This centralizes cross-cutting concerns and simplifies client code.
The gateway handles responsibilities that would otherwise be duplicated across services: authentication, rate limiting, request routing, protocol translation, and response aggregation. It also provides a stable API even as backend services evolve, acting as a facade that shields clients from internal complexity.
flowchart TB
Clients["Clients"]
Gateway["API Gateway"]
US["Users Service"]
OS["Orders Service"]
PS["Products Service"]
Clients --> Gateway
Gateway --> US
Gateway --> OS
Gateway --> PS
style Clients fill:#374151,stroke:#a855f7,stroke-width:2px
style Gateway fill:#7c3aed,stroke:#a855f7,stroke-width:2px
style US fill:#6366f1,stroke:#a855f7,stroke-width:2px
style OS fill:#6366f1,stroke:#a855f7,stroke-width:2px
style PS fill:#6366f1,stroke:#a855f7,stroke-width:2px
Gateway responsibilities:
- Routing: Direct requests to appropriate services
- Authentication: Validate tokens, enforce security
- Rate limiting: Protect services from overload
- Load balancing: Distribute traffic across instances
- Response aggregation: Combine responses from multiple services
- Protocol translation: REST to gRPC, etc.
How do you configure Spring Cloud Gateway?
Spring Cloud Gateway is the modern API gateway for Spring applications, replacing the older Zuul gateway. It's built on Spring WebFlux, providing non-blocking, reactive routing. Routes are configured either in YAML or programmatically using a fluent Java API.
The gateway integrates with service discovery (Eureka, Consul) for dynamic routing - the lb:// prefix enables load-balanced calls to registered services. Filters modify requests and responses, and you can add circuit breakers for resilience.
// Spring Cloud Gateway configuration
@Configuration
public class GatewayConfig {
@Bean
public RouteLocator customRoutes(RouteLocatorBuilder builder) {
return builder.routes()
.route("users", r -> r
.path("/api/users/**")
.filters(f -> f
.stripPrefix(1)
.addRequestHeader("X-Request-Source", "gateway"))
.uri("lb://user-service"))
.route("orders", r -> r
.path("/api/orders/**")
.filters(f -> f
.stripPrefix(1)
.circuitBreaker(c -> c
.setName("ordersCircuitBreaker")
.setFallbackUri("forward:/fallback/orders")))
.uri("lb://order-service"))
.build();
}
}
Service Discovery Questions
In dynamic environments, services come and go. Discovery solves "where is service X?"
What is service discovery and why do microservices need it?
Service discovery is a mechanism that allows services to find each other without hardcoded network addresses. In dynamic environments where services scale up/down and move between hosts, you can't rely on static configuration. Services register themselves with a registry on startup and query the registry to find other services.
Without service discovery, you'd need to manually configure every service with the addresses of every other service it calls - and update all configurations whenever anything changes. Service discovery automates this, enabling dynamic scaling, rolling deployments, and failover.
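As a rough sketch of that register/heartbeat/evict cycle, here is a deliberately simplified in-memory registry in plain Java. The class and method names are hypothetical, not the Eureka client API: services renew a lease on each heartbeat, and lookups filter out instances whose lease has expired.

```java
// Illustrative toy registry (hypothetical names, not Eureka).
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

class RegistrySketch {
    // service name -> (instance address -> timestamp of last heartbeat)
    private final Map<String, Map<String, Long>> instances = new ConcurrentHashMap<>();
    private final long leaseMillis;

    RegistrySketch(long leaseMillis) { this.leaseMillis = leaseMillis; }

    // Services call this on startup and on every subsequent heartbeat.
    void heartbeat(String service, String address, long now) {
        instances.computeIfAbsent(service, s -> new ConcurrentHashMap<>()).put(address, now);
    }

    // Clients query for live instances; expired leases are filtered out.
    List<String> lookup(String service, long now) {
        return instances.getOrDefault(service, Map.of()).entrySet().stream()
                .filter(e -> now - e.getValue() < leaseMillis)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

A real registry adds replication, health checks, and client-side caching, but the core contract is exactly this: register, renew, evict on silence.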
flowchart TB
subgraph Registry["Service Registry"]
direction TB
subgraph UserSvc["user-service"]
U1["192.168.1.10:8080 healthy"]
U2["192.168.1.11:8080 healthy"]
end
subgraph OrderSvc["order-service"]
O1["192.168.1.20:8080 healthy"]
O2["192.168.1.21:8080 unhealthy"]
end
end
style Registry fill:#1e1b4b,stroke:#a855f7,stroke-width:2px
style UserSvc fill:#6366f1,stroke:#a855f7,stroke-width:2px
style OrderSvc fill:#6366f1,stroke:#a855f7,stroke-width:2px
style U1 fill:#22c55e,stroke:#16a34a
style U2 fill:#22c55e,stroke:#16a34a
style O1 fill:#22c55e,stroke:#16a34a
style O2 fill:#ef4444,stroke:#dc2626
How do you configure service registration with Eureka?
Netflix Eureka is a popular service registry for Spring Cloud applications. Services register themselves as Eureka clients, sending heartbeats to indicate they're alive. The registry removes instances that stop sending heartbeats after a configurable period.
Configuration is straightforward: add the Eureka client dependency, enable discovery with an annotation, and configure the registry URL. Eureka supports multiple instances for high availability - clients can register with and query from any instance.
# application.yml
spring:
  application:
    name: order-service
eureka:
  client:
    service-url:
      defaultZone: http://eureka-server:8761/eureka/
  instance:
    prefer-ip-address: true
    lease-renewal-interval-in-seconds: 10
    lease-expiration-duration-in-seconds: 30
@SpringBootApplication
@EnableDiscoveryClient
public class OrderServiceApplication {
public static void main(String[] args) {
SpringApplication.run(OrderServiceApplication.class, args);
}
}
What is the difference between client-side and server-side service discovery?
In client-side discovery, the client queries the service registry, receives a list of available instances, and chooses one using a load balancing algorithm. The client is responsible for selecting the instance. In server-side discovery, the client calls a load balancer which queries the registry and routes the request.
Client-side gives more control over load balancing and can implement sticky sessions, but requires a discovery-aware client library. Server-side is simpler for clients and language-agnostic, but adds a network hop. In Kubernetes environments, server-side discovery through Services is the standard approach.
Client-side discovery:
flowchart TB
Client["Client"]
Registry["Registry"]
I1["Instance 1"]
I2["Instance 2"]
Client -->|"1. Query registry"| Registry
Client -->|"2. Choose instance"| I1
Client -.->|"alternative"| I2
style Client fill:#6366f1,stroke:#a855f7,stroke-width:2px
style Registry fill:#7c3aed,stroke:#a855f7,stroke-width:2px
style I1 fill:#22c55e,stroke:#16a34a,stroke-width:2px
style I2 fill:#22c55e,stroke:#16a34a,stroke-width:2px
// Spring Cloud LoadBalancer (client-side)
@Configuration
public class LoadBalancerConfig {
@Bean
@LoadBalanced // Enables client-side load balancing
public RestTemplate restTemplate() {
return new RestTemplate();
}
}
// Usage - "user-service" resolved via discovery
restTemplate.getForObject("http://user-service/api/users/1", UserDTO.class);
Server-side discovery (Kubernetes):
flowchart TB
Client["Client"]
LB["Load Balancer"]
I1["Instance 1"]
I2["Instance 2"]
Client -->|"1. Call load balancer"| LB
LB -->|"2. Routes to instance"| I1
LB -.->|"or"| I2
style Client fill:#6366f1,stroke:#a855f7,stroke-width:2px
style LB fill:#7c3aed,stroke:#a855f7,stroke-width:2px
style I1 fill:#22c55e,stroke:#16a34a,stroke-width:2px
style I2 fill:#22c55e,stroke:#16a34a,stroke-width:2px
# Kubernetes Service (server-side discovery)
apiVersion: v1
kind: Service
metadata:
  name: user-service
spec:
  selector:
    app: user-service
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP
Circuit Breaker Questions
Circuit breakers prevent cascading failures in distributed systems.
What is the circuit breaker pattern and why is it important?
The circuit breaker pattern prevents an application from repeatedly calling a failing service. Like an electrical circuit breaker, it "trips" when failures exceed a threshold, stopping further calls until the service recovers. This prevents cascading failures where one slow service brings down the entire system.
Without circuit breakers, a failing downstream service causes requests to pile up, consuming threads and connections. Eventually the calling service also fails, and the failure cascades through the system. Circuit breakers fail fast, freeing resources and allowing graceful degradation.
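A circuit breaker is essentially a small state machine. Here is a deliberately simplified plain-Java sketch (a hypothetical class, not Resilience4j, which uses sliding windows rather than a consecutive-failure counter): it trips to OPEN after a run of failures, fails fast while OPEN, and allows a trial call in HALF_OPEN after a cool-down.

```java
// Toy circuit breaker state machine (illustrative only).
import java.util.function.Supplier;

class CircuitBreakerSketch {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openTimeoutMillis;
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0;

    CircuitBreakerSketch(int failureThreshold, long openTimeoutMillis) {
        this.failureThreshold = failureThreshold;
        this.openTimeoutMillis = openTimeoutMillis;
    }

    synchronized <T> T call(Supplier<T> action) {
        if (state == State.OPEN) {
            if (System.currentTimeMillis() - openedAt >= openTimeoutMillis) {
                state = State.HALF_OPEN;   // cool-down expired: allow a test request
            } else {
                throw new IllegalStateException("circuit open: failing fast");
            }
        }
        try {
            T result = action.get();
            state = State.CLOSED;          // success closes the circuit
            consecutiveFailures = 0;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;        // trip: stop calling the failing service
                openedAt = System.currentTimeMillis();
            }
            throw e;
        }
    }

    synchronized State state() { return state; }
}
```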
stateDiagram-v2
direction LR
[*] --> CLOSED
CLOSED --> OPEN: failures > threshold
OPEN --> HALF_OPEN: timeout expires
HALF_OPEN --> CLOSED: test succeeds
HALF_OPEN --> OPEN: test fails
state CLOSED {
[*] --> Normal
Normal: Requests pass through
}
state OPEN {
[*] --> Failing
Failing: Requests fail fast
}
state HALF_OPEN {
[*] --> Testing
Testing: Test requests allowed
}
How do you implement a circuit breaker with Resilience4j?
Resilience4j is the modern choice for circuit breakers in Spring applications, replacing Netflix Hystrix which is in maintenance mode. It provides a lightweight, functional approach with annotations or programmatic API. The circuit breaker tracks success/failure rates and transitions between states automatically.
Configuration defines thresholds for failure rate, slow call rate, and timing. The fallback method provides degraded functionality when the circuit is open - this might return cached data, a default value, or an error message explaining the service is temporarily unavailable.
// Resilience4j Circuit Breaker
@Service
public class OrderService {
private final CircuitBreaker circuitBreaker;
private final UserClient userClient;
public OrderService(CircuitBreakerRegistry registry, UserClient userClient) {
this.circuitBreaker = registry.circuitBreaker("userService");
this.userClient = userClient;
}
public UserDTO getUser(Long userId) {
return circuitBreaker.executeSupplier(() -> userClient.getUser(userId));
}
// Or with annotations
@CircuitBreaker(name = "userService", fallbackMethod = "getUserFallback")
public UserDTO getUserWithAnnotation(Long userId) {
return userClient.getUser(userId);
}
private UserDTO getUserFallback(Long userId, Exception ex) {
log.warn("Fallback for user {}: {}", userId, ex.getMessage());
return new UserDTO(userId, "Unknown", "unknown@example.com");
}
}
resilience4j:
  circuitbreaker:
    instances:
      userService:
        sliding-window-size: 10
        failure-rate-threshold: 50
        wait-duration-in-open-state: 30s
        permitted-number-of-calls-in-half-open-state: 3
        slow-call-rate-threshold: 80
        slow-call-duration-threshold: 2s
Resilience Pattern Questions
Beyond circuit breakers, several patterns help build resilient microservices.
What is the retry pattern and when should you use it?
The retry pattern automatically retries failed operations that might succeed on subsequent attempts. It's effective for transient failures like network glitches, temporary service unavailability, or database connection issues. The key is identifying which failures are retryable.
Configure retries with exponential backoff to avoid overwhelming a recovering service. Set maximum attempts to prevent infinite loops. Importantly, only retry idempotent operations - retrying a non-idempotent operation like "charge credit card" could result in duplicate charges.
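The retry-with-exponential-backoff loop that annotations like the one below delegate to the framework looks roughly like this. This is a hypothetical helper for illustration, not the Resilience4j internals:

```java
// Retry with exponential backoff (illustrative sketch).
import java.util.function.Supplier;

class RetrySketch {
    static <T> T withRetry(Supplier<T> action, int maxAttempts, long initialDelayMillis) {
        long delay = initialDelayMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;                      // remember the failure
                if (attempt == maxAttempts) break;
                try {
                    Thread.sleep(delay);       // back off before retrying
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new RuntimeException(ie);
                }
                delay *= 2;                    // exponential backoff: 500ms, 1s, 2s, ...
            }
        }
        throw last;                            // exhausted attempts: surface last error
    }
}
```

Note that this loop blindly retries every RuntimeException; production configuration (as in the YAML below) whitelists retryable exceptions and excludes business errors.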
@Retry(name = "userService", fallbackMethod = "getUserFallback")
public UserDTO getUser(Long userId) {
return userClient.getUser(userId);
}
resilience4j:
  retry:
    instances:
      userService:
        max-attempts: 3
        wait-duration: 500ms
        exponential-backoff-multiplier: 2
        retry-exceptions:
          - java.io.IOException
          - java.net.SocketTimeoutException
        ignore-exceptions:
          - com.example.BusinessException
What is the bulkhead pattern?
The bulkhead pattern isolates failures by limiting concurrent calls to a service. Named after ship bulkheads that prevent a hull breach from sinking the entire ship, this pattern ensures that problems with one service don't exhaust resources needed for others.
Resilience4j offers two bulkhead types: semaphore (limits concurrent calls) and thread pool (isolates calls in separate threads). Thread pool bulkhead provides stronger isolation but has higher overhead. Use semaphore for most cases, thread pool when you need complete isolation.
@Bulkhead(name = "userService", type = Bulkhead.Type.THREADPOOL)
public CompletableFuture<UserDTO> getUser(Long userId) {
return CompletableFuture.supplyAsync(() -> userClient.getUser(userId));
}
resilience4j:
  bulkhead:
    instances:
      userService:
        max-concurrent-calls: 20
        max-wait-duration: 500ms
  thread-pool-bulkhead:
    instances:
      userService:
        max-thread-pool-size: 10
        core-thread-pool-size: 5
        queue-capacity: 20
How do you combine multiple resilience patterns?
In production, you typically combine circuit breaker, retry, bulkhead, and timeout patterns. The order in which they wrap each other matters. With Resilience4j's Spring annotations, the default aspect order from outermost to innermost is Retry → CircuitBreaker → RateLimiter → TimeLimiter → Bulkhead, so the bulkhead sits closest to the actual call.
Because Retry wraps the circuit breaker, the circuit breaker records each retry attempt as a separate call, and the time limiter bounds each individual attempt. The bulkhead, applied innermost, limits how many of those calls can execute concurrently.
@CircuitBreaker(name = "userService", fallbackMethod = "fallback")
@Retry(name = "userService")
@Bulkhead(name = "userService")
@TimeLimiter(name = "userService")
public CompletableFuture<UserDTO> getUser(Long userId) {
return CompletableFuture.supplyAsync(() -> userClient.getUser(userId));
}
// Default aspect order (outermost first): Retry → CircuitBreaker → TimeLimiter → Bulkhead
resilience4j:
  timelimiter:
    instances:
      userService:
        timeout-duration: 2s
        cancel-running-future: true
Database Per Service Questions
Data management is the hardest problem in microservices.
Why should each microservice have its own database?
The database-per-service pattern gives each service exclusive ownership of its data store. No other service can access it directly - they must go through the owning service's API. This enables true independence: services can evolve their schemas, choose appropriate database technologies, and scale data storage independently.
The alternative - shared database - creates tight coupling. Schema changes affect multiple services. Performance problems in one service impact others. Teams can't deploy independently. While database-per-service is harder initially, it's essential for realizing microservices benefits.
flowchart TB
subgraph Services["Microservices"]
US["Users Service"]
OS["Orders Service"]
PS["Products Service"]
end
US --> MySQL[("MySQL")]
OS --> Postgres[("Postgres")]
PS --> MongoDB[("MongoDB")]
style Services fill:#1e1b4b,stroke:#a855f7,stroke-width:2px
style US fill:#6366f1,stroke:#a855f7,stroke-width:2px
style OS fill:#6366f1,stroke:#a855f7,stroke-width:2px
style PS fill:#6366f1,stroke:#a855f7,stroke-width:2px
style MySQL fill:#0ea5e9,stroke:#0284c7,stroke-width:2px
style Postgres fill:#3b82f6,stroke:#2563eb,stroke-width:2px
style MongoDB fill:#22c55e,stroke:#16a34a,stroke-width:2px
Benefits:
- Services are truly independent
- Can choose best database for use case
- Schema changes don't affect other services
- Independent scaling
Challenges:
- No joins across services
- Data duplication
- Consistency is hard
How do you handle queries that need data from multiple services?
When you need data from multiple services, you have several options. API Composition has a service (often the API Gateway) call multiple services and combine results. CQRS maintains denormalized read models optimized for specific queries. Event-driven synchronization keeps local copies of needed data.
The best choice depends on consistency requirements and query patterns. API Composition is simplest but adds latency. CQRS provides fast reads but requires eventual consistency. Local copies work well for slowly-changing reference data.
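A minimal sketch of API Composition, with plain functions standing in for the remote calls. All class names, fields, and the composer itself are hypothetical, chosen only to illustrate the pattern:

```java
// API Composition: fetch from two "services" and merge into one view.
import java.util.function.Function;

class OrderSummaryComposer {
    record User(long id, String name) {}
    record Order(long id, long userId, double total) {}
    record OrderSummary(long orderId, String customerName, double total) {}

    private final Function<Long, Order> orderService; // stands in for a REST call
    private final Function<Long, User> userService;   // stands in for a REST call

    OrderSummaryComposer(Function<Long, Order> orderService, Function<Long, User> userService) {
        this.orderService = orderService;
        this.userService = userService;
    }

    OrderSummary getSummary(long orderId) {
        Order order = orderService.apply(orderId);     // call order-service first
        User user = userService.apply(order.userId()); // then user-service for the name
        return new OrderSummary(order.id(), user.name(), order.total());
    }
}
```

The cost is visible in the structure: two sequential network calls per query, which is why read-heavy paths often move to a CQRS read model instead.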
Saga Pattern Questions
Sagas manage distributed transactions across microservices.
What is the saga pattern and why is it needed?
The saga pattern manages transactions spanning multiple services without distributed transactions (2PC). Instead of one ACID transaction, a saga is a sequence of local transactions. Each service performs its local transaction and publishes an event. If any step fails, compensating transactions undo the previous steps.
Traditional distributed transactions don't work well in microservices because they require locks across services, creating tight coupling and scalability problems. Sagas embrace eventual consistency - the system is temporarily inconsistent during the saga but reaches consistency when it completes.
What is the difference between choreography and orchestration in sagas?
Choreography is decentralized - services react to events without a central coordinator. Each service knows what events to listen for and what events to publish. Orchestration uses a central saga coordinator that tells each service what to do and tracks the overall state.
Choreography is simpler for small sagas and promotes loose coupling, but becomes hard to understand as sagas grow. Orchestration provides clear visibility into saga state and is easier to test, but the coordinator can become a bottleneck.
Choreography (event-driven):
flowchart LR
subgraph Saga["Event-Driven Saga"]
OS["Order Service"]
IS["Inventory Service"]
PS["Payment Service"]
end
OS -->|"OrderCreated"| IS
IS -->|"InventoryReserved"| PS
PS -->|"PaymentProcessed"| OS
style Saga fill:#1e1b4b,stroke:#a855f7,stroke-width:2px
style OS fill:#6366f1,stroke:#a855f7,stroke-width:2px
style IS fill:#6366f1,stroke:#a855f7,stroke-width:2px
style PS fill:#6366f1,stroke:#a855f7,stroke-width:2px
// Order Service - starts saga
@Service
public class OrderService {
@Transactional
public Order createOrder(CreateOrderRequest request) {
Order order = orderRepository.save(new Order(request, OrderStatus.PENDING));
eventPublisher.publish(new OrderCreatedEvent(order));
return order;
}
@EventHandler
public void on(PaymentProcessedEvent event) {
Order order = orderRepository.findById(event.getOrderId()).orElseThrow();
order.setStatus(OrderStatus.CONFIRMED);
orderRepository.save(order);
eventPublisher.publish(new OrderConfirmedEvent(order));
}
@EventHandler
public void on(PaymentFailedEvent event) {
// Compensating action
Order order = orderRepository.findById(event.getOrderId()).orElseThrow();
order.setStatus(OrderStatus.CANCELLED);
orderRepository.save(order);
eventPublisher.publish(new OrderCancelledEvent(order));
}
}
// Inventory Service - reacts to events
@Service
public class InventoryService {
@EventHandler
public void on(OrderCreatedEvent event) {
try {
reserveInventory(event.getItems());
eventPublisher.publish(new InventoryReservedEvent(event.getOrderId()));
} catch (InsufficientInventoryException e) {
eventPublisher.publish(new InventoryReservationFailedEvent(event.getOrderId()));
}
}
@EventHandler
public void on(OrderCancelledEvent event) {
// Compensating action - release reserved inventory
releaseInventory(event.getOrderId());
}
}
Orchestration (central coordinator):
flowchart TB
Orch["Saga Orchestrator"]
OS["Order Service"]
IS["Inventory Service"]
PS["Payment Service"]
Orch -->|"1. Create"| OS
Orch -->|"2. Reserve"| IS
Orch -->|"3. Charge"| PS
style Orch fill:#7c3aed,stroke:#a855f7,stroke-width:2px
style OS fill:#6366f1,stroke:#a855f7,stroke-width:2px
style IS fill:#6366f1,stroke:#a855f7,stroke-width:2px
style PS fill:#6366f1,stroke:#a855f7,stroke-width:2px
// Saga Orchestrator
@Service
public class CreateOrderSaga {
public Order execute(CreateOrderRequest request) {
SagaExecution saga = SagaExecution.start();
try {
// Step 1: Create order
Order order = orderService.createOrder(request);
saga.addCompensation(() -> orderService.cancelOrder(order.getId()));
// Step 2: Reserve inventory
inventoryService.reserve(order.getItems());
saga.addCompensation(() -> inventoryService.release(order.getId()));
// Step 3: Process payment
paymentService.process(order.getId(), order.getTotal());
saga.addCompensation(() -> paymentService.refund(order.getId()));
// Step 4: Confirm order
orderService.confirm(order.getId());
return order;
} catch (Exception e) {
saga.compensate(); // Run compensations in reverse order
throw new SagaFailedException(e);
}
}
}
When should you choose choreography over orchestration?
Choose choreography for simple sagas with few steps where services should remain highly decoupled. It has no single point of failure and each service is autonomous. The risk is that saga state becomes distributed and hard to track.
Choose orchestration for complex sagas with many steps where you need clear visibility into progress. It's easier to test and debug. The risk is that the orchestrator becomes a bottleneck or single point of failure.
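The orchestrator example above calls a SagaExecution helper that isn't shown. A minimal standalone sketch of what such a helper might do (the class name and methods are taken from the example, not from any library) is a stack of compensating actions that runs in reverse order on failure:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal compensation tracker: records an undo action as each saga step
// succeeds, then runs them in reverse (LIFO) order if a later step fails.
public class SagaExecution {
    private final Deque<Runnable> compensations = new ArrayDeque<>();

    public static SagaExecution start() {
        return new SagaExecution();
    }

    public void addCompensation(Runnable compensation) {
        compensations.push(compensation); // most recent step is compensated first
    }

    public void compensate() {
        while (!compensations.isEmpty()) {
            try {
                compensations.pop().run();
            } catch (RuntimeException e) {
                // Swallowed here for brevity; a real implementation would retry
                // or route the failed compensation to a dead-letter queue.
            }
        }
    }
}
```

Running compensations in reverse matters: if inventory was reserved after the order was created, it must be released before the order is cancelled.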
CQRS Questions
CQRS separates read and write operations for better scalability.
What is CQRS and when should you use it?
CQRS (Command Query Responsibility Segregation) separates the read model from the write model. Commands (writes) go to one model optimized for updates, queries (reads) go to another model optimized for retrieval. The read model is updated asynchronously through events from the write model.
Use CQRS when read and write patterns differ significantly, when you need to scale reads independently, when complex queries would slow down the write database, or when using event sourcing. Don't use it for simple CRUD applications - the complexity isn't worth it.
flowchart TB
subgraph CQRS["CQRS Architecture"]
direction TB
subgraph Write["Write Side"]
CMD["Commands"]
WM["Write Model"]
PG[("PostgreSQL")]
CMD --> WM --> PG
end
subgraph Read["Read Side"]
QRY["Queries"]
RM["Read Model"]
ES[("Elasticsearch")]
QRY --> RM --> ES
end
WM -->|"Events"| RM
end
style CQRS fill:#1e1b4b,stroke:#a855f7,stroke-width:2px
style Write fill:#6366f1,stroke:#a855f7,stroke-width:2px
style Read fill:#7c3aed,stroke:#a855f7,stroke-width:2px
style PG fill:#3b82f6,stroke:#2563eb,stroke-width:2px
style ES fill:#f59e0b,stroke:#d97706,stroke-width:2px
How do you handle eventual consistency in CQRS?
In CQRS, the read model is eventually consistent with the write model - there's a delay between when data is written and when it appears in queries. Design your UI to handle this: show "processing" states, use optimistic updates, or poll for completion.
Accept that users might briefly see stale data. For most applications, milliseconds to seconds of inconsistency is acceptable. If you need stronger consistency for specific operations, query the write model directly for those cases.
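One common mitigation is version-based polling: the write side returns the new aggregate version, and the client or API layer waits until the read model has applied events up to that version before querying it. A standalone sketch (the method and parameter names are illustrative, not a framework API):

```java
import java.util.function.LongSupplier;

// Waits until the read model reports it has processed events up to
// expectedVersion, or gives up after maxAttempts polls.
public class ReadYourWrites {
    public static boolean awaitVersion(LongSupplier readModelVersion,
                                       long expectedVersion,
                                       int maxAttempts,
                                       long sleepMillis) throws InterruptedException {
        for (int i = 0; i < maxAttempts; i++) {
            if (readModelVersion.getAsLong() >= expectedVersion) {
                return true; // read model caught up; safe to query it
            }
            Thread.sleep(sleepMillis);
        }
        return false; // still stale: fall back to the write model or show "processing"
    }
}
```

The timeout branch is the important part: when the read model doesn't catch up in time, degrade gracefully instead of blocking the user.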
Spring Cloud Questions
Spring Cloud provides tools for common microservices patterns.
What is Spring Cloud Config and how does it work?
Spring Cloud Config provides centralized configuration management for distributed systems. A Config Server serves configuration from a Git repository (or other backends). Services fetch their configuration at startup and can refresh it without restarting.
This solves the problem of managing configuration across many services and environments. Instead of configuring each service separately, you maintain configuration files in Git with all the benefits of version control, audit trails, and pull request workflows.
# Config Server application.yml
spring:
cloud:
config:
server:
git:
uri: https://github.com/company/config-repo
search-paths: '{application}'
# Client application.yml
spring:
application:
name: order-service
config:
import: configserver:http://config-server:8888
Configuration files in git repo:
config-repo/
├── application.yml          # Shared by all services
├── order-service.yml        # Order service specific
├── order-service-prod.yml   # Order service production
└── user-service.yml         # User service specific
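To refresh configuration without restarting, expose Spring Cloud's refresh actuator endpoint on the client and annotate beans that hold configuration values with @RefreshScope; a POST to /actuator/refresh then re-fetches configuration from the Config Server and rebuilds those beans. A minimal client-side addition might look like:

```yaml
# Client application.yml - expose the refresh endpoint
management:
  endpoints:
    web:
      exposure:
        include: health,refresh
```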
How do you implement declarative REST clients with OpenFeign?
OpenFeign creates REST clients from annotated interfaces - you declare the API contract, and Feign generates the implementation. This reduces boilerplate and makes service calls look like local method calls. Combined with service discovery, you just use service names instead of URLs.
Feign integrates with Resilience4j for circuit breakers and retry, and with Spring Cloud LoadBalancer for client-side load balancing. Fallback factories provide graceful degradation when services are unavailable.
@FeignClient(
name = "user-service",
fallbackFactory = UserClientFallbackFactory.class
)
public interface UserClient {
@GetMapping("/api/users/{id}")
UserDTO getUser(@PathVariable Long id);
@GetMapping("/api/users")
List<UserDTO> getUsers(@RequestParam List<Long> ids);
@PostMapping("/api/users")
UserDTO createUser(@RequestBody CreateUserRequest request);
}
@Component
public class UserClientFallbackFactory implements FallbackFactory<UserClient> {
private static final Logger log = LoggerFactory.getLogger(UserClientFallbackFactory.class);
@Override
public UserClient create(Throwable cause) {
return new UserClient() {
@Override
public UserDTO getUser(Long id) {
log.error("Fallback for getUser: {}", cause.getMessage());
return new UserDTO(id, "Unknown", "fallback@example.com");
}
@Override
public List<UserDTO> getUsers(List<Long> ids) {
return Collections.emptyList();
}
@Override
public UserDTO createUser(CreateUserRequest request) {
throw new ServiceUnavailableException("User service unavailable");
}
};
}
}
How does distributed tracing work with Micrometer and Zipkin?
Distributed tracing tracks requests as they flow through multiple services. Each request gets a unique trace ID that's propagated to all services it touches. Within a trace, each service operation creates a span with timing information. Visualizing traces shows the full request path and where time is spent.
Spring Boot 3 uses Micrometer for metrics and tracing, replacing the older Spring Cloud Sleuth. Configuration is minimal - add dependencies and configure the Zipkin endpoint. Trace context is automatically propagated through HTTP headers.
# application.yml
management:
tracing:
sampling:
probability: 1.0 # Sample all requests (use lower in prod)
zipkin:
tracing:
endpoint: http://zipkin:9411/api/v2/spans
// Trace context is automatically propagated
@RestController
public class OrderController {
private static final Logger log = LoggerFactory.getLogger(OrderController.class);
@GetMapping("/orders/{id}")
public Order getOrder(@PathVariable Long id) {
// traceId and spanId automatically included in logs
log.info("Fetching order {}", id);
return orderService.findById(id);
}
}
// Log output includes trace context:
// 2026-01-07 10:30:00 [order-service,abc123,def456] INFO OrderController - Fetching order 42
// ^service ^traceId ^spanId
Deployment and Observability Questions
Operating microservices requires strong DevOps practices.
How do you containerize a Spring Boot microservice?
Containerization packages your application with its dependencies into a portable image. For Spring Boot, you create a Dockerfile that starts from a JRE base image, copies your JAR, and defines the entrypoint. Use slim base images (Alpine) to minimize image size and attack surface.
Spring Boot 2.3+ includes built-in support for building OCI images with ./mvnw spring-boot:build-image, which uses Cloud Native Buildpacks and doesn't require a Dockerfile. For more control, write your own Dockerfile.
# Dockerfile
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
COPY target/order-service-*.jar app.jar
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]
# docker-compose.yml for local development
version: '3.8'
services:
eureka:
image: eureka-server:latest
ports:
- "8761:8761"
config-server:
image: config-server:latest
ports:
- "8888:8888"
depends_on:
- eureka
user-service:
image: user-service:latest
depends_on:
- eureka
- config-server
environment:
- EUREKA_URI=http://eureka:8761/eureka
order-service:
image: order-service:latest
depends_on:
- eureka
- config-server
- user-service
environment:
- EUREKA_URI=http://eureka:8761/eureka
How do you deploy microservices to Kubernetes?
Kubernetes manages containerized applications at scale. Each service becomes a Deployment (defines how to run your containers) and a Service (provides network access). Kubernetes handles scaling, rolling updates, and self-healing.
Configure health checks (readiness and liveness probes) so Kubernetes knows when your service is ready for traffic and when it needs to be restarted. Set resource requests and limits to ensure fair resource sharing and prevent runaway services from affecting others.
apiVersion: apps/v1
kind: Deployment
metadata:
name: order-service
spec:
replicas: 3
selector:
matchLabels:
app: order-service
template:
metadata:
labels:
app: order-service
spec:
containers:
- name: order-service
image: order-service:1.0.0
ports:
- containerPort: 8080
env:
- name: SPRING_PROFILES_ACTIVE
value: kubernetes
readinessProbe:
httpGet:
path: /actuator/health/readiness
port: 8080
initialDelaySeconds: 30
livenessProbe:
httpGet:
path: /actuator/health/liveness
port: 8080
initialDelaySeconds: 60
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "1Gi"
cpu: "500m"
What observability stack do you need for microservices?
Microservices require comprehensive observability: metrics (Prometheus + Grafana), logs (ELK stack or Loki), and traces (Zipkin or Jaeger). Together, these let you understand system behavior, diagnose issues, and track requests across services.
Spring Boot Actuator exposes metrics and health endpoints. Configure structured JSON logging for easy aggregation. Custom health indicators let you include downstream service health in your status.
Metrics (Prometheus + Grafana):
management:
endpoints:
web:
exposure:
include: health,info,prometheus
metrics:
tags:
application: ${spring.application.name}
Logging (ELK Stack):
<!-- logback-spring.xml -->
<appender name="JSON" class="ch.qos.logback.core.ConsoleAppender">
<encoder class="net.logstash.logback.encoder.LogstashEncoder">
<includeMdcKeyName>traceId</includeMdcKeyName>
<includeMdcKeyName>spanId</includeMdcKeyName>
</encoder>
</appender>
Health Checks:
@Component
public class ExternalServiceHealthIndicator implements HealthIndicator {
@Override
public Health health() {
if (externalServiceIsHealthy()) {
return Health.up()
.withDetail("externalService", "Available")
.build();
}
return Health.down()
.withDetail("externalService", "Unavailable")
.build();
}
}
Microservices Testing Questions
Testing distributed systems requires multiple strategies.
How do you handle API versioning in microservices?
API versioning ensures backward compatibility as services evolve. Common approaches include URL versioning (/api/v1/users), header versioning (Accept: application/vnd.api.v1+json), and query parameter versioning (?version=1).
URL versioning is most common because it's visible and easy to understand. Whatever approach you choose, support at least N-1 versions to give consumers time to migrate. Use consumer-driven contract testing to detect breaking changes before deployment.
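As a toy illustration of URL versioning (plain Java rather than Spring MVC; the handler bodies are made up), a dispatcher can route on the version segment of the path and reject versions it no longer supports:

```java
import java.util.Map;
import java.util.function.Function;

// Routes /api/v1/users vs /api/v2/users to different handlers, so both
// response shapes can be served while consumers migrate off v1.
public class VersionedRouter {
    private final Map<String, Function<Long, String>> handlers = Map.of(
        "v1", id -> "{\"id\":" + id + ",\"name\":\"Alice\"}",            // legacy shape
        "v2", id -> "{\"id\":" + id + ",\"firstName\":\"Alice\"}"        // new shape
    );

    public String handle(String path, long id) {
        String[] parts = path.split("/"); // e.g. ["", "api", "v1", "users"]
        String version = parts.length > 2 ? parts[2] : "";
        Function<Long, String> handler = handlers.get(version);
        if (handler == null) {
            return "{\"error\":\"unsupported API version\"}";
        }
        return handler.apply(id);
    }
}
```

The same idea applies with header or query-parameter versioning; only the place the version is read from changes.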
How do you debug a request that spans multiple services?
Debugging distributed requests requires correlation. Every request gets a unique trace ID that's passed through all service calls. Centralized logging aggregates logs from all services, searchable by trace ID. Distributed tracing tools (Zipkin, Jaeger) visualize the request path with timing.
Start debugging by finding the trace ID (usually in the error response or logs), then search centralized logs for that ID to see all related log entries. Use the tracing UI to see which service calls succeeded or failed and how long each took.
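The mechanics of correlation are simple enough to sketch standalone: reuse the inbound trace ID if one is present, otherwise start a new trace, and forward the same ID on every outgoing call. Micrometer Tracing does this automatically; the header name below follows the B3 propagation convention:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Sketch of trace-context propagation across service hops.
public class TraceContext {
    static final String TRACE_HEADER = "X-B3-TraceId";

    public static Map<String, String> outgoingHeaders(Map<String, String> incomingHeaders) {
        // Continue the existing trace, or start a new one at the edge.
        String traceId = incomingHeaders.getOrDefault(
                TRACE_HEADER, UUID.randomUUID().toString().replace("-", ""));
        Map<String, String> headers = new HashMap<>();
        headers.put(TRACE_HEADER, traceId); // same trace ID flows to the next service
        return headers;
    }
}
```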
How do you test microservices effectively?
Testing microservices requires a layered approach. Unit tests verify business logic in isolation. Integration tests verify each service works with its database (use Testcontainers for real databases). Contract tests (Pact) verify API compatibility between services. End-to-end tests verify complete flows but should be used sparingly.
The testing pyramid still applies: many unit tests, fewer integration tests, even fewer end-to-end tests. Contract tests are especially important - they catch breaking API changes without requiring all services to be running together.
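A real setup uses Pact, but the core idea behind consumer-driven contracts can be shown standalone: the consumer declares which response fields it depends on, and the provider's actual payload is verified against that list, so a renamed or dropped field fails in CI rather than in production (all names here are illustrative):

```java
import java.util.List;
import java.util.Map;

// Toy contract check: report every field the consumer relies on that is
// missing from the provider's response payload.
public class ContractCheck {
    public static List<String> missingFields(Map<String, Object> providerResponse,
                                             List<String> consumerContract) {
        return consumerContract.stream()
                .filter(field -> !providerResponse.containsKey(field))
                .toList();
    }
}
```

Pact adds the important operational pieces on top of this idea: recording contracts from consumer tests, a broker to share them, and provider verification builds.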
Related Resources
- Spring Boot Interview Guide - Foundation for Spring Cloud
- Docker Interview Guide - Containerization fundamentals
- Kubernetes Interview Guide - Container orchestration
- System Design Interview Guide - Architectural patterns
- REST API Interview Guide - API design principles
