▶ When should I use MongoDB instead of PostgreSQL?
Use MongoDB when: (1) your data structure evolves frequently (schema migrations are expensive in SQL); (2) you have hierarchical/nested data (documents map naturally to objects); (3) you need horizontal scaling beyond a single PostgreSQL server; (4) your read patterns are highly denormalized (avoiding JOIN complexity). Use PostgreSQL when: your data is highly relational, ACID guarantees across multiple tables matter, or your data structure is stable and well-defined. MongoDB excels at content management, user profiles, IoT sensor logs, and real-time analytics. PostgreSQL excels at transactional systems (banking, e-commerce orders, inventory). Hybrid: use both, with PostgreSQL as the transactional core and MongoDB for logs, caching, and user activity.
▶ How do I design documents efficiently for MongoDB?
Three patterns: (1) Embed related data in one document if it's accessed together (e.g., user profile + address + preferences = one doc). Limit: 16MB max per document, so don't embed unbounded arrays. (2) Reference data with ObjectIds if it's updated independently (e.g., users reference order IDs rather than embedding full orders). (3) Hybrid: embed hot data (user name, email) and reference cold data (full order history), as in the sketch below. Use dot notation to query embedded fields: `db.users.find({'address.city': 'NYC'})`. Profile your queries first, then denormalize intentionally; over-normalization defeats MongoDB's strengths.
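A minimal mongosh sketch of the hybrid pattern; the `users` and `orders` collections and their fields are hypothetical:

```javascript
// Embed hot data that is read together with the user; keep only a
// bounded list of references to data that changes independently.
const userId = db.users.insertOne({
  name: 'Ada',
  email: 'ada@example.com',
  address: { city: 'NYC', zip: '10001' },  // embedded: read with the profile
  recentOrderIds: []                       // referenced: bounded array of order ids
}).insertedId;

// Orders live in their own collection and are updated independently.
db.orders.insertOne({ userId: userId, amount: 42, status: 'new' });

// Dot notation queries embedded fields directly.
db.users.find({ 'address.city': 'NYC' });
```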
▶ What's the aggregation pipeline and when do I use it?
Aggregation pipeline = multi-stage data transformation (think map, filter, reduce, sort). Stages: $match (filter), $project (select fields), $group (aggregate), $sort (order), $lookup (join), $unwind (explode arrays), $facet (multi-dimensional output). Use it for complex analytics (sum revenue by region, count users per cohort), transforming data before sending it to the client, or replacing client-side processing. Example: `db.orders.aggregate([{$match: {date: {$gt: startDate}}}, {$group: {_id: '$customerId', total: {$sum: '$amount'}}}])` gets total revenue per customer. The pipeline is faster than fetching raw docs and processing in code because filtering and grouping happen server-side. Limit: a plain aggregate() doesn't modify the stored documents; write pipeline results back with the $out or $merge stages, or use updateMany for in-place updates.
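The same example spelled out as a runnable mongosh pipeline (the `orders` collection and its fields are hypothetical), with a sort and limit added to show stage chaining:

```javascript
// Total revenue per customer since a given date, biggest spenders first.
const startDate = ISODate('2024-01-01');

db.orders.aggregate([
  { $match: { date: { $gt: startDate } } },                        // filter first (can use indexes)
  { $group: { _id: '$customerId', total: { $sum: '$amount' } } },  // aggregate per customer
  { $sort: { total: -1 } },                                        // order by revenue
  { $limit: 10 }                                                   // top 10 only
]);
```

Putting $match first matters: every later stage then operates on the filtered subset rather than the whole collection.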
▶ How do I optimize queries: indexing strategy?
Index = a sorted B-tree copy of a field; it speeds up reads but slows down writes. Strategy: (1) Use `db.collection.find().explain('executionStats')` to see how many documents a query scans. Scan count >> returned documents = missing index. (2) Index your most common query filters first (e.g., if 90% of queries filter by userId, index userId). (3) Compound indexes match query shape: a query that filters on userId and sorts by createdAt descending matches index `{userId: 1, createdAt: -1}` exactly (field order matters for the sort). (4) Prefix rule: index `{a, b, c}` serves queries on `{a}`, `{a, b}`, or `{a, b, c}`, but NOT `{b, c}` alone. (5) Avoid over-indexing; each index slows inserts. Monitor index sizes with `db.collection.stats()` and per-index usage with the `$indexStats` aggregation stage. Atlas also has a Performance Advisor that suggests indexes.
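A sketch of that workflow in mongosh, against a hypothetical `orders` collection:

```javascript
// 1. Diagnose: compare totalDocsExamined vs. nReturned in the output.
db.orders.find({ userId: 42 }).sort({ createdAt: -1 }).explain('executionStats');

// 2. Fix: create a compound index matching the query shape
//    (equality field first, then the sort field).
db.orders.createIndex({ userId: 1, createdAt: -1 });

// 3. Monitor: per-index usage counts, to spot indexes that are never used.
db.orders.aggregate([{ $indexStats: {} }]);
```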
▶ What's the difference between transactions and atomicity in MongoDB?
Single-document atomicity = updates to a single document are always atomic (all-or-nothing). Multi-document transactions (v4.0+ on replica sets, v4.2+ on sharded clusters) = multiple operations across multiple documents treated as one atomic unit. Use transactions when updating a user balance AND logging the transaction (both must succeed or both must fail). Syntax: `session.startTransaction()`, execute ops, then `session.commitTransaction()` or `session.abortTransaction()`, as sketched below. Cost: transactions are slower (roughly 2-5x) than single-document updates and hold locks on the documents they touch. Best practice: design documents to minimize the need for transactions (embed related data instead). Transactions on sharded clusters have higher latency; avoid them where possible.
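A minimal transaction sketch in mongosh, assuming a hypothetical `bank` database with `accounts` and `ledger` collections:

```javascript
const session = db.getMongo().startSession();
const accounts = session.getDatabase('bank').getCollection('accounts');
const ledger = session.getDatabase('bank').getCollection('ledger');

session.startTransaction({ writeConcern: { w: 'majority' } });
try {
  accounts.updateOne({ _id: 'alice' }, { $inc: { balance: -100 } });
  ledger.insertOne({ account: 'alice', amount: -100, at: new Date() });
  session.commitTransaction();   // both writes become visible together
} catch (e) {
  session.abortTransaction();    // neither write is applied
  throw e;
} finally {
  session.endSession();
}
```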
▶ How do I handle data consistency and replication?
Replica sets = a primary (accepts writes) plus secondaries (replicas). Write concern controls how many members must acknowledge a write before it returns success: `w: 1` (primary only; the default before v5.0) or the stronger `w: 'majority'` (the default since v5.0). Read preference = primary (default, consistent), primaryPreferred (fall back to a secondary if the primary is down), or secondary (read from replicas, e.g., for analytics). For high consistency: use `{w: 'majority'}` writes (slower but safer). For high throughput: use `w: 1` (fast, but the write can be lost if the primary fails before a secondary replicates it). Change streams = listen for real-time updates: `collection.watch([{$match: {operationType: 'insert'}}])` yields an event for each new insert. Use them for microservices notifications, syncing to search indexes, and real-time dashboards; a sketch follows.
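A sketch of both knobs in mongosh; the `events` collection is hypothetical:

```javascript
// Durable write: wait for a majority of replica-set members to acknowledge.
db.events.insertOne(
  { type: 'signup', at: new Date() },
  { writeConcern: { w: 'majority' } }
);

// Change stream: react to inserts in real time (e.g., sync to a search index).
const stream = db.events.watch([{ $match: { operationType: 'insert' } }]);
while (!stream.isClosed()) {
  const change = stream.tryNext();       // non-blocking; null if nothing new yet
  if (change) printjson(change.fullDocument);
}
```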
▶ What are common MongoDB mistakes and how do I avoid them?
Mistake 1: Using MongoDB for highly relational data (the equivalent of 10+ JOINs); it's slower than SQL there. Fix: normalize in PostgreSQL. Mistake 2: Embedding unbounded arrays (e.g., all comments in a post); the document bloats toward the 16MB cap. Fix: store comment IDs in the post and fetch comments separately. Mistake 3: No indexes on filter fields, so queries scan 100% of the collection. Fix: run explain(), add indexes. Mistake 4: Ignoring write concern; you can lose data if the primary crashes. Fix: use `w: 'majority'`. Mistake 5: Denormalizing everything, which causes update anomalies (changing a user's name in 1000 docs). Fix: reference instead of embed for mutable data. Mistake 6: Not using projection, so queries fetch unnecessary fields. Fix: `.find({}, {_id: 1, name: 1})` to fetch only id + name. Test consistency with read-your-own-write semantics: write, then immediately read it back to verify, as sketched below.
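Two of those fixes as mongosh one-liners, against a hypothetical `users` collection:

```javascript
// Mistake 6 fix: project only the fields you need.
db.users.find({ active: true }, { _id: 1, name: 1 });

// Read-your-own-write check: durable write, then read from the primary
// so the read cannot hit a stale secondary.
db.users.insertOne({ _id: 'u1', name: 'Ada' }, { writeConcern: { w: 'majority' } });
db.users.find({ _id: 'u1' }).readPref('primary');
```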