40+ MongoDB Interview Questions 2025: Mongoose, Aggregation & Schema Design

19 min read

Tags: mongodb, interview-questions, database, nodejs, mongoose, backend

MongoDB is one of the most widely used databases in the Node.js ecosystem, yet "SQL vs NoSQL" debates still dominate interview rooms. The real question isn't which is better—it's whether you understand when MongoDB is the right choice and how to use it effectively.

This guide covers the most important MongoDB interview questions, from fundamental concepts to advanced aggregation pipelines and schema design patterns.

Table of Contents

  1. MongoDB Fundamentals Questions
  2. Mongoose Schema Questions
  3. Schema Design Questions
  4. CRUD Operations Questions
  5. Aggregation Pipeline Questions
  6. Indexing Questions
  7. Connection and Transaction Questions
  8. Scaling and Performance Questions

MongoDB Fundamentals Questions

Understanding MongoDB's core concepts is essential for any interview involving NoSQL databases.

What are documents and collections in MongoDB?

MongoDB stores data as BSON documents, which are JSON-like structures with additional data types. A collection is analogous to a table in relational databases, containing multiple documents. Unlike SQL rows, documents in the same collection can have different fields and structures.

The key differences from SQL include schema flexibility (documents can have varying fields), first-class support for nested objects and arrays, and a design philosophy that often favors denormalization over JOINs.

// MongoDB stores data as BSON documents (JSON-like)
// Collection = table, Document = row
 
// A document in the "users" collection
{
  _id: ObjectId("507f1f77bcf86cd799439011"),
  name: "Sarah Chen",
  email: "sarah@example.com",
  profile: {
    bio: "Full-stack developer",
    skills: ["Node.js", "MongoDB", "React"]
  },
  createdAt: ISODate("2024-01-15T10:30:00Z")
}

When should you use MongoDB instead of a relational database?

The decision between MongoDB and SQL databases comes down to data structure and access patterns. MongoDB excels when your schema evolves frequently, when you read and write entire documents together, when you need horizontal scaling, or when your data is naturally hierarchical like user profiles or content.

Relational databases are better when you have complex relationships between entities, need unpredictable queries with JOINs, require transactions spanning multiple tables, or when data integrity constraints are paramount. Many production applications use both—SQL for financial transactions and authentication, MongoDB for activity feeds or content storage where flexibility matters more than strict consistency.

How does MongoDB compare to SQL databases?

MongoDB and SQL databases have fundamentally different data models and terminology. Understanding these mappings helps when transitioning between the two or explaining concepts in interviews.

| SQL         | MongoDB                   | Mongoose                  |
| ----------- | ------------------------- | ------------------------- |
| Table       | Collection                | Model                     |
| Row         | Document                  | Document instance         |
| Column      | Field                     | Schema field              |
| Primary Key | _id (ObjectId)            | _id                       |
| Foreign Key | Reference (ObjectId)      | ref + populate()          |
| JOIN        | $lookup                   | .populate()               |
| GROUP BY    | $group                    | .aggregate()              |
| INDEX       | createIndex()             | schema.index()            |
| Transaction | session.withTransaction() | session.withTransaction() |

Mongoose Schema Questions

Mongoose provides structure and validation on top of MongoDB's flexible document model.

What is Mongoose and why would you use it?

Mongoose is an ODM (Object Document Mapper) for MongoDB and Node.js that provides schema validation, type casting, query building, middleware hooks, and business logic encapsulation. While MongoDB itself is schema-less, Mongoose adds application-level schema enforcement, making it easier to maintain data consistency and add business logic to your models.

Mongoose simplifies common operations through a fluent query API, adds virtual properties and instance methods to documents, and provides middleware for pre/post hooks on operations like save and validate.

How do you define a Mongoose schema with validation?

A Mongoose schema defines the structure, validation rules, and behavior for documents in a collection. You specify field types, required fields, default values, custom validators, and indexes. The schema also supports virtual properties, instance methods, static methods, and middleware hooks.

const mongoose = require('mongoose');
const bcrypt = require('bcrypt');
 
const userSchema = new mongoose.Schema({
  name: {
    type: String,
    required: [true, 'Name is required'],
    trim: true,
    maxlength: 100
  },
  email: {
    type: String,
    required: true,
    unique: true,
    lowercase: true,
    match: [/^\S+@\S+\.\S+$/, 'Invalid email format']
  },
  password: {
    type: String,
    required: true,
    minlength: 8,
    select: false  // Don't include in queries by default
  },
  role: {
    type: String,
    enum: ['user', 'admin', 'moderator'],
    default: 'user'
  },
  profile: {
    bio: { type: String, maxlength: 500 },
    avatar: String,
    skills: [String]
  },
  loginAttempts: { type: Number, default: 0 },
  lockUntil: Date
}, {
  timestamps: true,  // Adds createdAt and updatedAt
  toJSON: { virtuals: true }
});
 
// Indexes for query performance
// (no separate email index needed: `unique: true` already creates one)
userSchema.index({ 'profile.skills': 1 });
userSchema.index({ createdAt: -1 });
 
// Virtual property (not stored in DB)
userSchema.virtual('isLocked').get(function() {
  return this.lockUntil && this.lockUntil > Date.now();
});
 
// Instance method
userSchema.methods.comparePassword = async function(candidatePassword) {
  return bcrypt.compare(candidatePassword, this.password);
};
 
// Static method
userSchema.statics.findByEmail = function(email) {
  return this.findOne({ email: email.toLowerCase() });
};
 
// Pre-save middleware
userSchema.pre('save', async function(next) {
  if (!this.isModified('password')) return next();
  this.password = await bcrypt.hash(this.password, 12);
  next();
});
 
const User = mongoose.model('User', userSchema);

How do you ensure data quality without a database schema?

While MongoDB itself is schema-flexible, you enforce data quality at the application level using Mongoose. Mongoose provides type validation and casting, required fields and custom validators, pre/post hooks for business logic, and default values. For critical collections, you can also use MongoDB's built-in JSON Schema validation as a database-level safety net.
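To sketch that database-level safety net (collection and field names here are illustrative), a $jsonSchema validator can be attached when the collection is created. This is a mongosh-style snippet, not application code:

```javascript
// mongosh-style sketch: reject writes that violate the schema.
// Field names and rules below are illustrative.
db.createCollection('users', {
  validator: {
    $jsonSchema: {
      bsonType: 'object',
      required: ['name', 'email'],
      properties: {
        name: { bsonType: 'string', maxLength: 100 },
        email: {
          bsonType: 'string',
          pattern: '^\\S+@\\S+\\.\\S+$'
        },
        loginAttempts: { bsonType: 'int', minimum: 0 }
      }
    }
  },
  validationAction: 'error'  // 'warn' would only log violations
});
```

Because this runs in the database itself, it catches bad writes even from scripts or services that bypass your Mongoose models.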


Schema Design Questions

Schema design is the most important architectural decision in MongoDB applications.

When should you embed data in MongoDB?

Embedding (denormalization) stores related data in a single document. This approach works best when the data is always accessed together with the parent document, when you have a one-to-few relationship, and when the embedded data rarely changes independently from the parent.

Common examples include embedding addresses in a user document or embedding line items in an order. The benefit is that a single query returns all the data you need without additional lookups.

// GOOD: Embed addresses in user document
// - Accessed together with user
// - One-to-few relationship
// - Rarely updated independently
const userSchema = new mongoose.Schema({
  name: String,
  addresses: [{
    street: String,
    city: String,
    zipCode: String,
    isDefault: Boolean
  }]
});
 
// Query returns everything in one call
const user = await User.findById(userId);
console.log(user.addresses[0].city);

When should you reference data instead of embedding?

Referencing (normalization) stores relationships as ObjectId references to documents in other collections. This approach is better when the data is accessed independently from the parent, when you have one-to-many or many-to-many relationships, when the related data could grow unboundedly, or when documents would exceed MongoDB's 16MB size limit.

Use Mongoose's populate() method or the $lookup aggregation stage to join referenced data when needed.

// GOOD: Reference orders separately
// - Accessed independently
// - One-to-many relationship (user has many orders)
// - Orders grow unboundedly
const orderSchema = new mongoose.Schema({
  user: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'User',
    required: true,
    index: true
  },
  items: [{
    product: { type: mongoose.Schema.Types.ObjectId, ref: 'Product' },
    quantity: Number,
    price: Number
  }],
  total: Number,
  status: String
});
 
// Populate to join data
const orders = await Order.find({ user: userId })
  .populate('user', 'name email')
  .populate('items.product', 'name price');

What factors determine embedding vs referencing?

The decision between embedding and referencing depends on several factors. Consider the relationship cardinality, how data is accessed, update frequency, document size limits, and whether data duplication is acceptable.

| Factor           | Embed           | Reference                 |
| ---------------- | --------------- | ------------------------- |
| Relationship     | One-to-few      | One-to-many, many-to-many |
| Access pattern   | Always together | Often separate            |
| Update frequency | Rarely changes  | Changes independently     |
| Document size    | Small (< 16MB)  | Could grow large          |
| Data duplication | Acceptable      | Problematic               |

A practical example: embed a user's addresses because they're always fetched with the user profile and there are only a few. Reference orders because a user could have thousands, they're queried independently, and they update frequently.

How would you design a schema for a blog platform?

A blog platform schema demonstrates both embedding and referencing patterns. Users are standalone documents. Posts reference their author but embed a limited number of recent comments for quick display. Full comments live in a separate collection for scalability, supporting features like pagination and threading.

// Users - standalone collection
const { Schema } = mongoose;
const { ObjectId } = Schema.Types;

const userSchema = new Schema({
  username: { type: String, unique: true },
  email: { type: String, unique: true },
  passwordHash: String,
  profile: {
    displayName: String,
    bio: String,
    avatar: String
  }
});
 
// Posts - references author, embeds limited comments
const postSchema = new Schema({
  author: { type: ObjectId, ref: 'User', index: true },
  title: String,
  slug: { type: String, unique: true },
  content: String,
  tags: { type: [String], index: true },
  status: { type: String, enum: ['draft', 'published'] },
  // Embed recent comments (limit to prevent unbounded growth)
  recentComments: [{
    author: { type: ObjectId, ref: 'User' },
    content: String,
    createdAt: Date
  }],
  commentCount: { type: Number, default: 0 },
  viewCount: { type: Number, default: 0 }
}, { timestamps: true });
 
// Full comments - separate collection for scalability
const commentSchema = new Schema({
  post: { type: ObjectId, ref: 'Post', index: true },
  author: { type: ObjectId, ref: 'User' },
  content: String,
  parentComment: { type: ObjectId, ref: 'Comment' }  // For threading
}, { timestamps: true });
 
// Indexes for common queries
// (slug already has a unique index from `unique: true` above)
postSchema.index({ author: 1, createdAt: -1 });
postSchema.index({ tags: 1, status: 1 });

CRUD Operations Questions

Understanding MongoDB's CRUD operations is fundamental for any Node.js developer working with the database.

How do you create documents in MongoDB?

MongoDB provides several methods for creating documents. The create() method inserts a single document with validation, while insertMany() efficiently inserts multiple documents in a single operation. Always handle validation errors and duplicate key violations appropriately.

// Single document
const user = await User.create({
  name: 'John Doe',
  email: 'john@example.com',
  password: 'securePassword123'
});
 
// Multiple documents
const users = await User.insertMany([
  { name: 'Alice', email: 'alice@example.com' },
  { name: 'Bob', email: 'bob@example.com' }
]);
 
// With validation handling
try {
  const user = await User.create(userData);
} catch (error) {
  if (error.code === 11000) {
    // Duplicate key error (unique constraint)
    throw new Error('Email already exists');
  }
  if (error.name === 'ValidationError') {
    // Mongoose validation failed
    const messages = Object.values(error.errors).map(e => e.message);
    throw new Error(messages.join(', '));
  }
  throw error;
}

How do you query documents in MongoDB?

MongoDB offers flexible querying with methods like findById(), findOne(), and find(). Query builders chain methods for selecting fields, sorting, pagination, and performance optimization. The lean() method returns plain JavaScript objects instead of Mongoose documents, which is faster for read-only operations.

// Find one
const userById = await User.findById(id);
const userByEmail = await User.findOne({ email: 'john@example.com' });
 
// Find many with query builders
const users = await User.find({ role: 'admin' })
  .select('name email createdAt')     // Only these fields
  .sort({ createdAt: -1 })            // Newest first
  .skip(20)                           // Pagination offset
  .limit(10)                          // Page size
  .lean();                            // Return plain objects (faster)
 
// Complex queries
const matchingUsers = await User.find({
  createdAt: { $gte: new Date('2024-01-01') },
  'profile.skills': { $in: ['Node.js', 'MongoDB'] },
  role: { $ne: 'admin' }
});
 
// Text search (requires text index)
const results = await Product.find(
  { $text: { $search: 'wireless bluetooth' } },
  { score: { $meta: 'textScore' } }
).sort({ score: { $meta: 'textScore' } });

How do you update documents in MongoDB?

MongoDB provides several update methods with different behaviors. The updateOne() method applies changes and returns only a result summary, while findByIdAndUpdate() returns the document itself (pass new: true to get the updated version rather than the pre-update one). Update operators like $set, $inc, $push, and $pull modify specific fields without replacing the entire document.

// Update one document
const result = await User.updateOne(
  { _id: userId },
  { $set: { 'profile.bio': 'Updated bio' } }
);
 
// Find and update (returns the document)
const user = await User.findByIdAndUpdate(
  userId,
  { $inc: { loginAttempts: 1 } },
  { new: true, runValidators: true }  // Return updated doc, run validators
);
 
// Update operators
await User.updateOne({ _id: userId }, {
  $set: { name: 'New Name' },           // Set field value
  $unset: { tempField: '' },            // Remove field
  $inc: { loginCount: 1 },              // Increment number
  $push: { 'profile.skills': 'GraphQL' }, // Add to array
  $pull: { 'profile.skills': 'jQuery' },  // Remove from array
  $addToSet: { tags: 'verified' }       // Add to array if not exists
});
 
// Bulk updates
await User.updateMany(
  { lastLogin: { $lt: new Date('2023-01-01') } },
  { $set: { status: 'inactive' } }
);

How do you delete documents in MongoDB?

MongoDB supports both hard deletes and soft deletes. Hard deletes permanently remove documents using deleteOne() or deleteMany(). Soft deletes set a deletedAt timestamp, preserving data for recovery. A query middleware can automatically filter out soft-deleted documents.

// Delete one
await User.deleteOne({ _id: userId });
// ...or get the deleted document back:
const deletedUser = await User.findByIdAndDelete(userId);
 
// Delete many
const result = await User.deleteMany({ status: 'inactive' });
console.log(`Deleted ${result.deletedCount} users`);
 
// Soft delete pattern (preferred for most apps)
const userSchema = new mongoose.Schema({
  // ... other fields
  deletedAt: Date
});
 
userSchema.pre(/^find/, function() {
  this.where({ deletedAt: null });
});
 
userSchema.methods.softDelete = function() {
  this.deletedAt = new Date();
  return this.save();
};

Aggregation Pipeline Questions

The aggregation pipeline is MongoDB's most powerful feature for data analysis and transformation.

What is the MongoDB aggregation pipeline and how does it work?

The aggregation pipeline is a framework for data processing that passes documents through a series of stages, with each stage transforming the data. Documents flow sequentially through stages, with each stage's output becoming the next stage's input. This is MongoDB's answer to SQL's GROUP BY and JOIN operations, but with more flexibility.

// Sales report: total revenue by product category
const report = await Order.aggregate([
  // Stage 1: Filter to completed orders this year
  {
    $match: {
      status: 'completed',
      createdAt: { $gte: new Date('2024-01-01') }
    }
  },
  // Stage 2: Unwind the items array (one doc per item)
  { $unwind: '$items' },
  // Stage 3: Lookup product details
  {
    $lookup: {
      from: 'products',
      localField: 'items.product',
      foreignField: '_id',
      as: 'productInfo'
    }
  },
  // Stage 4: Flatten the lookup result
  { $unwind: '$productInfo' },
  // Stage 5: Group by category
  {
    $group: {
      _id: '$productInfo.category',
      totalRevenue: { $sum: { $multiply: ['$items.quantity', '$items.price'] } },
      totalOrders: { $sum: 1 },
      avgOrderValue: { $avg: { $multiply: ['$items.quantity', '$items.price'] } }
    }
  },
  // Stage 6: Sort by revenue descending
  { $sort: { totalRevenue: -1 } },
  // Stage 7: Reshape output
  {
    $project: {
      category: '$_id',
      totalRevenue: { $round: ['$totalRevenue', 2] },
      totalOrders: 1,
      avgOrderValue: { $round: ['$avgOrderValue', 2] },
      _id: 0
    }
  }
]);

What are the most commonly used aggregation stages?

The aggregation pipeline has many stages, but certain ones appear in almost every pipeline. Understanding these core stages helps you build complex data transformations.

// $match - Filter documents (like WHERE)
{ $match: { status: 'active', age: { $gte: 18 } } }
 
// $group - Aggregate values (like GROUP BY)
{ $group: {
    _id: '$category',
    count: { $sum: 1 },
    avgPrice: { $avg: '$price' },
    maxPrice: { $max: '$price' },
    items: { $push: '$name' }  // Collect into array
}}
 
// $project - Reshape documents (like SELECT)
{ $project: {
    name: 1,
    email: 1,
    fullName: { $concat: ['$firstName', ' ', '$lastName'] },
    year: { $year: '$createdAt' }
}}
 
// $lookup - Join collections (like LEFT JOIN)
{ $lookup: {
    from: 'orders',
    localField: '_id',
    foreignField: 'userId',
    as: 'userOrders'
}}
 
// $unwind - Flatten arrays
{ $unwind: '$tags' }  // One document per tag
 
// $sort, $skip, $limit - Pagination
{ $sort: { createdAt: -1 } },
{ $skip: 20 },
{ $limit: 10 }
 
// $facet - Multiple pipelines in parallel
{ $facet: {
    results: [{ $skip: 0 }, { $limit: 10 }],
    totalCount: [{ $count: 'count' }]
}}

Indexing Questions

Proper indexing is critical for MongoDB performance at scale.

How do you create indexes in MongoDB?

Indexes improve query performance by avoiding full collection scans. You can define indexes in Mongoose schemas or create them programmatically. MongoDB supports single-field indexes, compound indexes (multiple fields), text indexes for full-text search, and TTL indexes that automatically delete documents after a specified time.

// In Mongoose schema
const productSchema = new mongoose.Schema({
  name: { type: String, index: true },        // Single field
  sku: { type: String, unique: true },        // Unique index
  category: String,
  price: Number,
  tags: [String],
  description: String
});
 
// Compound index (queries using both fields)
productSchema.index({ category: 1, price: -1 });
 
// Text index for search
productSchema.index({ name: 'text', description: 'text' });
 
// TTL index (auto-delete after time)
const sessionSchema = new mongoose.Schema({
  userId: mongoose.Schema.Types.ObjectId,
  expiresAt: { type: Date, expires: 0 }  // TTL: delete once expiresAt passes
});
 
// Programmatically
await Product.collection.createIndex({ category: 1, price: -1 });

What is a good indexing strategy for MongoDB?

A good indexing strategy starts with understanding your query patterns. Use explain() to verify whether queries use indexes (IXSCAN) or perform collection scans (COLLSCAN). For compound indexes, remember that order matters—an index on {a, b, c} supports queries on {a}, {a, b}, and {a, b, c}, but not queries on just {b} or {c}.

Follow the ESR rule for compound index field ordering: Equality fields first, then Sort fields, then Range fields.
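To make the ESR ordering concrete, here is a tiny illustrative helper (not part of MongoDB or Mongoose) that assembles a compound-index field order from a query's equality, sort, and range fields:

```javascript
// ESR rule sketch: Equality fields, then Sort fields, then Range fields.
// Purely illustrative helper; the field names below are hypothetical.
function esrIndexOrder({ equality = [], sort = [], range = [] }) {
  return [...equality, ...sort, ...range];
}

// Query: { status: 'active', price: { $lte: 100 } } sorted by createdAt
const fields = esrIndexOrder({
  equality: ['status'],   // status: 'active'         (equality match)
  sort: ['createdAt'],    // .sort({ createdAt: -1 }) (sort)
  range: ['price']        // price: { $lte: 100 }     (range)
});
console.log(fields);  // → [ 'status', 'createdAt', 'price' ]

// The corresponding Mongoose index definition would be:
// productSchema.index({ status: 1, createdAt: -1, price: 1 });
```

Ordering the range field last matters: once the index hits a range predicate, fields after it can no longer be used for sorting or equality narrowing.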

// Check if query uses index
const explanation = await User.find({ email: 'test@example.com' })
  .explain('executionStats');
 
console.log(explanation.executionStats.executionStages.stage);
// IXSCAN = using index (good)
// COLLSCAN = collection scan (bad for large collections)
 
// Index intersection vs compound index
// If you query: { category: 'electronics', brand: 'Apple' }
 
// Option 1: Two single indexes (MongoDB may intersect)
productSchema.index({ category: 1 });
productSchema.index({ brand: 1 });
 
// Option 2: Compound index (more efficient for this specific query)
productSchema.index({ category: 1, brand: 1 });
 
// Compound index order matters!
// Index { a: 1, b: 1, c: 1 } supports:
// - Queries on { a }
// - Queries on { a, b }
// - Queries on { a, b, c }
// But NOT queries on just { b } or { c }

Connection and Transaction Questions

Understanding connection management and transactions is essential for production MongoDB applications.

How do you manage MongoDB connections in Node.js?

Proper connection management includes setting pool sizes, handling connection events, and implementing graceful shutdown. Mongoose maintains a connection pool that reuses connections for efficiency. Configure the pool size based on your application's concurrency needs.

// Connection with best practices
const mongoose = require('mongoose');
 
const connectDB = async () => {
  try {
    await mongoose.connect(process.env.MONGODB_URI, {
      maxPoolSize: 10,           // Connection pool size
      serverSelectionTimeoutMS: 5000,
      socketTimeoutMS: 45000,
    });
    console.log('MongoDB connected');
  } catch (error) {
    console.error('MongoDB connection error:', error);
    process.exit(1);
  }
};
 
// Handle connection events
mongoose.connection.on('error', err => {
  console.error('MongoDB error:', err);
});
 
mongoose.connection.on('disconnected', () => {
  console.warn('MongoDB disconnected. Attempting reconnect...');
});
 
// Graceful shutdown
process.on('SIGINT', async () => {
  await mongoose.connection.close();
  console.log('MongoDB connection closed due to app termination');
  process.exit(0);
});

How do you handle transactions in MongoDB?

MongoDB supports multi-document ACID transactions since version 4.0 for replica sets and 4.2 for sharded clusters. Transactions ensure that multiple operations either all succeed or all fail together. Use the withTransaction() helper for automatic retry on transient errors.

Design your schema to minimize transaction needs by embedding related data. Single-document operations are always atomic in MongoDB, so well-designed schemas often eliminate the need for transactions entirely.

// Multi-document transaction (MongoDB 4.0+)
const session = await mongoose.startSession();
 
try {
  session.startTransaction();
 
  // All operations use the same session
  const user = await User.create([{ name: 'John', balance: 1000 }], { session });
 
  await Account.findByIdAndUpdate(
    fromAccountId,
    { $inc: { balance: -100 } },
    { session }
  );
 
  await Account.findByIdAndUpdate(
    toAccountId,
    { $inc: { balance: 100 } },
    { session }
  );
 
  // Commit if all succeeded
  await session.commitTransaction();
} catch (error) {
  // Rollback on any error
  await session.abortTransaction();
  throw error;
} finally {
  session.endSession();
}
 
// Using withTransaction helper (recommended: auto-retries transient errors)
// Note: start a fresh session (the previous one has already ended)
const session2 = await mongoose.startSession();
try {
  await session2.withTransaction(async () => {
    await Account.findByIdAndUpdate(fromId, { $inc: { balance: -100 } }, { session: session2 });
    await Account.findByIdAndUpdate(toId, { $inc: { balance: 100 } }, { session: session2 });
  });
} finally {
  session2.endSession();
}
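Because single-document writes are always atomic, a carefully chosen filter often replaces a transaction entirely. A sketch (the Product model, productId, and field names are assumptions; the actual database call is commented out):

```javascript
// Conditional single-document update: the filter guards the invariant
// (enough stock) and $inc applies both changes in one atomic write.
// Product, productId, and the field names are illustrative.
const productId = '507f1f77bcf86cd799439011';  // placeholder id
const qty = 2;

const filter = { _id: productId, stock: { $gte: qty } };  // only if stock suffices
const update = { $inc: { stock: -qty, sold: qty } };      // both changes together

// const res = await Product.updateOne(filter, update);
// if (res.modifiedCount === 0), the guard failed: stock was insufficient
```

No transaction is needed here because MongoDB either matches the filter and applies the whole update, or does nothing.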

Scaling and Performance Questions

Understanding how to scale MongoDB and optimize performance is crucial for senior-level interviews.

How do you handle the N+1 query problem in MongoDB?

The N+1 problem occurs when you fetch a list of items and then query each item's related data separately, resulting in N+1 database queries. MongoDB provides several solutions: embedding related data eliminates the need for additional queries, Mongoose's populate() batches reference lookups, the $lookup aggregation stage performs server-side joins, and the DataLoader pattern batches and caches lookups at the application layer.
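The batching idea behind DataLoader can be sketched in a few lines of plain Node.js (illustrative only; the real dataloader package adds caching and error handling):

```javascript
// DataLoader-pattern sketch (illustrative; the `dataloader` npm package
// is more robust). Loads requested in the same tick are collected and
// resolved with ONE batched fetch instead of N separate queries.
class MiniLoader {
  constructor(batchFn) {
    this.batchFn = batchFn;  // (keys) => Promise of values, same order
    this.queue = [];         // pending { key, resolve } entries
  }

  load(key) {
    return new Promise(resolve => {
      this.queue.push({ key, resolve });
      // First entry this tick schedules a single flush for the batch
      if (this.queue.length === 1) process.nextTick(() => this.flush());
    });
  }

  async flush() {
    const batch = this.queue.splice(0);
    const values = await this.batchFn(batch.map(item => item.key));
    batch.forEach((item, i) => item.resolve(values[i]));
  }
}

// Stub batch function standing in for something like
// User.find({ _id: { $in: ids } })
let queries = 0;
const userLoader = new MiniLoader(async ids => {
  queries++;
  return ids.map(id => ({ _id: id, name: `user-${id}` }));
});

Promise.all([userLoader.load(1), userLoader.load(2), userLoader.load(3)])
  .then(users => {
    console.log(queries);      // → 1 (one batched query, not three)
    console.log(users.length); // → 3
  });
```

The same shape is what populate() and $lookup give you on the server side; the application-level loader is mainly useful in GraphQL resolvers, where each resolver would otherwise fire its own query.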

How do you scale MongoDB?

MongoDB scales through multiple mechanisms. Replica sets provide read scaling and high availability by replicating data across primary and secondary nodes. Sharding distributes data across multiple servers for horizontal write scaling. Read preferences can route read operations to secondary nodes to distribute load.

For most applications, a properly indexed single replica set handles millions of documents efficiently. Sharding is typically only necessary when data volume exceeds a single server's capacity or when you need to distribute writes across geographic regions.
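As a sketch of those routing options (host names and the Post model are placeholders), read preference can be set once in the connection string or per query:

```javascript
// Connection-string level: route reads to secondaries when available.
// Host names and database name are placeholders.
const uri =
  'mongodb://db1.example.com,db2.example.com/app' +
  '?replicaSet=rs0&readPreference=secondaryPreferred';

// Per-query level with Mongoose: an analytics-style query that
// tolerates slightly stale data can read from a secondary
// const feed = await Post.find({ status: 'published' })
//   .read('secondaryPreferred')
//   .lean();
```

Reads from secondaries are eventually consistent, so keep anything that must see its own writes (auth, checkout) on the primary.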


Quick Reference

| SQL         | MongoDB                   | Mongoose                  |
| ----------- | ------------------------- | ------------------------- |
| Table       | Collection                | Model                     |
| Row         | Document                  | Document instance         |
| Column      | Field                     | Schema field              |
| Primary Key | _id (ObjectId)            | _id                       |
| Foreign Key | Reference (ObjectId)      | ref + populate()          |
| JOIN        | $lookup                   | .populate()               |
| GROUP BY    | $group                    | .aggregate()              |
| INDEX       | createIndex()             | schema.index()            |
| Transaction | session.withTransaction() | session.withTransaction() |

Ready to ace your interview?

Get 550+ interview questions with detailed answers in our comprehensive PDF guides.

View PDF Guides