SQL JOINs Explained: The Interview Question That Tests Your Database Skills

·13 min read
sqlinterview-questionsdatabasejoinsbackend

SQL JOINs appear in 94% of database-related interview questions, yet "LEFT JOIN vs INNER JOIN" remains the #1 most-searched SQL question on Stack Overflow. The gap between writing JOINs that work and understanding what they actually return is exactly what interviewers probe for. Here's how to nail this question.

The 30-Second Answer

When the interviewer asks "What's the difference between JOIN types?", here's your concise answer:

"INNER JOIN returns only matching rows from both tables - like the intersection in a Venn diagram. LEFT JOIN returns all rows from the left table plus matches from the right - unmatched rows get NULL values. RIGHT JOIN is the opposite. FULL OUTER JOIN returns all rows from both tables. I use INNER when I only want matched data, LEFT when I need all records from one table regardless of matches."

That's it. Wait for follow-up questions.

The 2-Minute Answer (If They Want More)

If they ask you to elaborate:

"Think of two tables: users and orders. A user might have many orders, some orders, or no orders.

INNER JOIN only shows users who have placed orders. Users without orders are excluded entirely.

LEFT JOIN shows all users. Users with orders show their order data; users without orders show NULL in the order columns. This is useful for finding users who haven't ordered.

RIGHT JOIN shows all orders. Orders with users show user data; orders with deleted or missing users show NULL. Less common but useful for data audits.

FULL OUTER JOIN shows everything - users without orders AND orders without users. Great for finding orphaned records.

The choice depends on what question you're answering. 'Show me order totals per user' might be INNER. 'Show me all users and their orders if any' needs LEFT."

Visual Explanation (Draw This in Interviews)

Sample Data

USERS                    ORDERS
+----+--------+          +----+---------+--------+
| id | name   |          | id | user_id | amount |
+----+--------+          +----+---------+--------+
| 1  | Alice  |          | 1  | 1       | 100    |
| 2  | Bob    |          | 2  | 1       | 150    |
| 3  | Carol  |          | 3  | 3       | 200    |
+----+--------+          +----+---------+--------+

Alice has 2 orders, Bob has 0 orders, Carol has 1 order

INNER JOIN

SELECT users.name, orders.amount
FROM users
INNER JOIN orders ON users.id = orders.user_id;
+--------+--------+
| name   | amount |
+--------+--------+
| Alice  | 100    |
| Alice  | 150    |
| Carol  | 200    |
+--------+--------+
-- Bob excluded (no matching orders)

LEFT JOIN

SELECT users.name, orders.amount
FROM users
LEFT JOIN orders ON users.id = orders.user_id;
+--------+--------+
| name   | amount |
+--------+--------+
| Alice  | 100    |
| Alice  | 150    |
| Bob    | NULL   |  <- Bob included with NULL
| Carol  | 200    |
+--------+--------+

RIGHT JOIN

SELECT users.name, orders.amount
FROM users
RIGHT JOIN orders ON users.id = orders.user_id;
-- Same as LEFT JOIN in this case (all orders have valid users)
-- If an order had user_id = 99 (doesn't exist):
+--------+--------+
| name   | amount |
+--------+--------+
| Alice  | 100    |
| Alice  | 150    |
| Carol  | 200    |
| NULL   | 75     |  <- Order with no matching user
+--------+--------+

FULL OUTER JOIN

SELECT users.name, orders.amount
FROM users
FULL OUTER JOIN orders ON users.id = orders.user_id;
+--------+--------+
| name   | amount |
+--------+--------+
| Alice  | 100    |
| Alice  | 150    |
| Bob    | NULL   |  <- User without orders
| Carol  | 200    |
| NULL   | 75     |  <- Order without user
+--------+--------+
-- Note: MySQL doesn't support FULL OUTER JOIN directly

The Classic Problem: Employee-Manager Self-Join

This question appears in almost every SQL interview:

Interviewer: "You have an employees table with id, name, and manager_id. Write a query to show each employee with their manager's name."

The Schema

CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    manager_id INT  -- References employees.id
);
 
INSERT INTO employees VALUES
(1, 'Alice', NULL),    -- CEO, no manager
(2, 'Bob', 1),         -- Reports to Alice
(3, 'Carol', 1),       -- Reports to Alice
(4, 'Dave', 2),        -- Reports to Bob
(5, 'Eve', 2);         -- Reports to Bob

The Solution

-- Basic self-join
SELECT
    e.name AS employee,
    m.name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;
+----------+---------+
| employee | manager |
+----------+---------+
| Alice    | NULL    |  <- CEO has no manager
| Bob      | Alice   |
| Carol    | Alice   |
| Dave     | Bob     |
| Eve      | Bob     |
+----------+---------+

Follow-Up: Show the Full Hierarchy

-- Employees with their manager and their manager's manager
SELECT
    e.name AS employee,
    m.name AS manager,
    mm.name AS managers_manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id
LEFT JOIN employees mm ON m.manager_id = mm.id;
+----------+---------+------------------+
| employee | manager | managers_manager |
+----------+---------+------------------+
| Alice    | NULL    | NULL             |
| Bob      | Alice   | NULL             |
| Carol    | Alice   | NULL             |
| Dave     | Bob     | Alice            |
| Eve      | Bob     | Alice            |
+----------+---------+------------------+

Practical JOIN Patterns

1. Find Records with No Match (Anti-Join)

-- Find users who have never placed an order
SELECT u.name
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE o.id IS NULL;
 
-- Alternative using NOT EXISTS (often faster)
SELECT u.name
FROM users u
WHERE NOT EXISTS (
    SELECT 1 FROM orders o WHERE o.user_id = u.id
);

2. Find Duplicates

-- Find duplicate email addresses
SELECT a.email, a.id, b.id
FROM users a
JOIN users b ON a.email = b.email AND a.id < b.id;

3. Join Multiple Tables

-- Orders with user and product information
SELECT
    u.name AS customer,
    p.name AS product,
    oi.quantity,
    o.created_at
FROM orders o
JOIN users u ON o.user_id = u.id
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id
WHERE o.created_at > '2024-01-01';

4. Aggregate with JOINs

-- Total spent per user (including users with no orders)
SELECT
    u.name,
    COALESCE(SUM(o.amount), 0) AS total_spent
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.name
ORDER BY total_spent DESC;

5. Find the Latest Record Per Group

-- Latest order for each user
SELECT u.name, o.amount, o.created_at
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.created_at = (
    SELECT MAX(o2.created_at)
    FROM orders o2
    WHERE o2.user_id = u.id
);
 
-- Alternative with window functions (better performance)
SELECT name, amount, created_at
FROM (
    SELECT
        u.name,
        o.amount,
        o.created_at,
        ROW_NUMBER() OVER (PARTITION BY u.id ORDER BY o.created_at DESC) AS rn
    FROM users u
    JOIN orders o ON u.id = o.user_id
) ranked
WHERE rn = 1;

The ON vs WHERE Trap

This is a common interview gotcha:

-- These return DIFFERENT results with LEFT JOIN!
 
-- Query 1: Condition in ON
SELECT u.name, o.amount
FROM users u
LEFT JOIN orders o ON u.id = o.user_id AND o.amount > 100;
 
-- Query 2: Condition in WHERE
SELECT u.name, o.amount
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE o.amount > 100;

Query 1 Result:

+--------+--------+
| name   | amount |
+--------+--------+
| Alice  | 150    |  <- Alice's 100 order excluded from match
| Bob    | NULL   |  <- Still shown (all left rows)
| Carol  | 200    |
+--------+--------+

Query 2 Result:

+--------+--------+
| name   | amount |
+--------+--------+
| Alice  | 150    |
| Carol  | 200    |
+--------+--------+
-- Bob excluded because NULL > 100 is FALSE

The rule: ON affects which rows match. WHERE filters after the join. For LEFT JOINs, put join relationships in ON and filtering in WHERE - unless you specifically want the behavior of Query 1.

CROSS JOIN and Cartesian Products

-- Generate all size-color combinations
SELECT sizes.name AS size, colors.name AS color
FROM sizes
CROSS JOIN colors;
 
-- Equivalent explicit syntax
SELECT sizes.name, colors.name
FROM sizes, colors;
 
-- Generate a date series (useful for reports)
SELECT dates.date, COALESCE(COUNT(o.id), 0) AS order_count
FROM (
    SELECT DATE('2024-01-01') + INTERVAL n DAY AS date
    FROM (SELECT 0 n UNION SELECT 1 UNION SELECT 2 /* ... */) numbers
) dates
LEFT JOIN orders o ON DATE(o.created_at) = dates.date
GROUP BY dates.date;

Common Follow-Up Questions

"What's the difference between JOIN and UNION?"

"JOIN combines tables horizontally - adding columns from related tables based on a key. UNION combines queries vertically - stacking rows from multiple SELECT statements.

JOIN needs a relationship between tables. UNION needs queries with the same column structure."

-- JOIN: More columns
SELECT u.name, o.amount
FROM users u
JOIN orders o ON u.id = o.user_id;
 
-- UNION: More rows
SELECT name, email FROM customers
UNION
SELECT name, email FROM leads;
-- Removes duplicates; use UNION ALL to keep them

"How do you find records that exist in one table but not another?"

"Use LEFT JOIN with NULL check, NOT EXISTS, or NOT IN. LEFT JOIN is often clearest, NOT EXISTS is usually fastest."

-- Method 1: LEFT JOIN + NULL check
SELECT a.*
FROM table_a a
LEFT JOIN table_b b ON a.id = b.a_id
WHERE b.id IS NULL;
 
-- Method 2: NOT EXISTS (often faster)
SELECT a.*
FROM table_a a
WHERE NOT EXISTS (SELECT 1 FROM table_b b WHERE b.a_id = a.id);
 
-- Method 3: NOT IN (careful with NULLs!)
SELECT a.*
FROM table_a a
WHERE a.id NOT IN (SELECT b.a_id FROM table_b b WHERE b.a_id IS NOT NULL);

"How do you handle NULL values in JOINs?"

"NULL never equals anything, including NULL. So NULL = NULL is FALSE in SQL. This matters for JOINs - rows with NULL in the join column won't match.

If you need to match NULLs, use COALESCE or IS NOT DISTINCT FROM (in PostgreSQL)."

-- NULLs won't match
SELECT * FROM a JOIN b ON a.value = b.value;
-- Rows where value IS NULL won't join
 
-- To match NULLs:
SELECT * FROM a JOIN b ON a.value = b.value OR (a.value IS NULL AND b.value IS NULL);
 
-- PostgreSQL: IS NOT DISTINCT FROM treats NULLs as equal
SELECT * FROM a JOIN b ON a.value IS NOT DISTINCT FROM b.value;

"When would you use a subquery instead of a JOIN?"

"Subqueries are useful when:

  • You need aggregation before joining
  • You're checking existence (EXISTS/NOT EXISTS)
  • You need values from a correlated lookup

JOINs are usually better for combining data from multiple tables. Modern optimizers often convert subqueries to JOINs anyway."

What Interviewers Are Really Testing

When I ask about JOINs, I'm checking:

  1. Conceptual understanding - Can you visualize what each JOIN returns?
  2. Practical application - Can you choose the right JOIN for a scenario?
  3. Edge cases - Do you understand NULL handling and the ON vs WHERE difference?
  4. Complex patterns - Can you write self-joins and multi-table queries?
  5. Performance awareness - Do you know about indexing and query optimization?

A candidate who draws the Venn diagram, solves the employee-manager problem, and mentions indexing will stand out.

Quick Reference Card

JOIN TypeReturns
INNER JOINOnly matching rows from both tables
LEFT JOINAll left rows + matching right rows (NULL if no match)
RIGHT JOINAll right rows + matching left rows (NULL if no match)
FULL OUTERAll rows from both tables (NULL where no match)
CROSS JOINCartesian product - every row combination
Self-JOINTable joined with itself (uses aliases)
PatternUse Case
LEFT JOIN + IS NULLFind records with no match
Self-joinHierarchical data, comparisons
GROUP BY with JOINAggregates across related data
Multiple JOINsCombine 3+ tables

Practice Questions

Test yourself before your interview:

1. Given tables products and categories, write a query to show all products with their category name, including products with no category.

2. Find all customers who have placed more than 3 orders.

3. Find pairs of employees who work in the same department.

4. What's the result of this query?

SELECT a.x, b.y
FROM A a
LEFT JOIN B b ON a.id = b.id
WHERE b.y = 'test';

Answers:

SELECT p.name, c.name AS category
FROM products p
LEFT JOIN categories c ON p.category_id = c.id;
SELECT c.name, COUNT(o.id) AS order_count
FROM customers c
JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name
HAVING COUNT(o.id) > 3;
SELECT e1.name, e2.name, e1.department
FROM employees e1
JOIN employees e2 ON e1.department = e2.department AND e1.id < e2.id;
  1. This filters out rows where b.y is NULL, effectively converting the LEFT JOIN to an INNER JOIN. Rows from A without a match in B are excluded because NULL = 'test' is FALSE.


Related Articles

If you found this helpful, check out these related guides:

Ready for More SQL Interview Questions?

This is just one topic from our complete SQL interview prep guide. Get access to 50+ SQL questions covering:

  • Subqueries and CTEs
  • Window functions
  • Indexes and query optimization
  • Transactions and locking
  • Database design and normalization

Get Full Access to All SQL Questions →

Or try our free SQL preview to see more questions like this.


Written by the EasyInterview team, based on real interview experience from 12+ years in tech and hundreds of technical interviews conducted at companies like BNY Mellon, UBS, and leading fintech firms.

Ready to ace your interview?

Get 550+ interview questions with detailed answers in our comprehensive PDF guides.

View PDF Guides