
Database Design & Architecture

TIER 2 · Tech
Salary impact: High
Time to learn: 8 months
Difficulty: Hard
Careers: 6
TL;DR

Database Design is the art and science of structuring data for optimal performance, consistency, and scale. Foundation skill for backend engineers and data architects. Learn schema normalization (1NF through BCNF), relational vs NoSQL tradeoffs, sharding/partitioning strategies, and real-world patterns (multi-tenant, time-series, document stores). Career path: Junior DBA ($85-115k) → Senior Backend ($115-155k) → Data Architect ($155-200k+) over 6-9 months. Salary impact: +$25k-$60k (backend/data/platform roles). Tools: PostgreSQL, MySQL, MongoDB, DynamoDB, Cassandra, Redis, dbdiagram.io, Prisma, Drizzle.

What is Database Design & Architecture

Design scalable database schemas, choose between SQL and NoSQL, and optimize queries. Normalize data, handle migrations, and plan for scale. It is the foundation for all data-driven applications.

Learning curve: Medium-Hard (theory + practical trade-offs)

🔧 TOOLS & ECOSYSTEM
PostgreSQL, MySQL, MongoDB, DynamoDB, Cassandra, Redis, dbdiagram.io, drawSQL, Lucidchart, Mermaid, Prisma, Drizzle

📋 Before you start

💰 Salary by region

Region | Junior | Mid | Senior
USA | $95k | $135k | $185k
UK | £55k | £80k | £115k
EU | €60k | €85k | €125k
Canada | C$105k | C$145k | C$195k

❓ FAQ

Relational vs NoSQL — when should I pick which for my application?
Relational (PostgreSQL, MySQL): ACID guarantees, complex queries, normalized data, consistency first. Pick it for multi-table relationships (e-commerce, banking, SaaS). NoSQL (MongoDB, DynamoDB): eventual consistency, horizontal scale, document/key-value storage. Pick it for unstructured data, high throughput, and schemaless flexibility. Hybrid: use both, relational for transactions and NoSQL for logs/caches/events. 2026 trend: PostgreSQL wins for most apps; NoSQL is reserved for specific use cases (real-time analytics, IoT, social feeds).
What are the normalization tradeoffs, and when should I denormalize?
Normalization (1NF → 3NF → BCNF) eliminates redundancy, saves space, and ensures consistency. Downside: queries need more joins, which can get slow. Denormalize when: reads dominate (reporting), joins are expensive, you control consistency elsewhere (event sourcing), or the data is non-critical. Real example: in e-commerce, normalize the product catalog but denormalize customer orders by saving an address snapshot, so old orders keep the address as it was at purchase time (see the sketch below). Monitor query plans; denormalize only if the profiler shows a join bottleneck.
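A minimal sketch of that order-snapshot pattern in PostgreSQL; the table and column names are hypothetical:

-- Customers stay normalized; each order copies the shipping address at
-- checkout so later address edits don't rewrite order history.
CREATE TABLE customers (
  id      bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  email   text NOT NULL UNIQUE,
  street  text,
  city    text,
  country text
);

CREATE TABLE orders (
  id           bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  customer_id  bigint NOT NULL REFERENCES customers (id),
  placed_at    timestamptz NOT NULL DEFAULT now(),
  -- denormalized snapshot, copied from customers at checkout
  ship_street  text NOT NULL,
  ship_city    text NOT NULL,
  ship_country text NOT NULL
);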
Should I use JSON columns in PostgreSQL instead of a separate table?
JSON columns (jsonb type) shine for: flexible schemas (user metadata, webhook payloads), nested data you rarely filter by, and semi-structured data. Avoid them if: you query deeply into the JSON, you need constraints or foreign keys on individual fields (use separate tables), or the data is highly relational. Best practice: jsonb plus a GIN index for fast containment queries (see the sketch below). PostgreSQL jsonb covers roughly 70% of NoSQL use cases; most apps stay on a relational + jsonb hybrid.
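A minimal sketch, assuming a hypothetical events table that stores webhook payloads in jsonb:

CREATE TABLE events (
  id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  received_at timestamptz NOT NULL DEFAULT now(),
  payload     jsonb NOT NULL
);

-- GIN index speeds up containment (@>) and key-existence (?) queries
CREATE INDEX events_payload_gin ON events USING GIN (payload);

-- Find events whose payload contains {"type": "invoice.paid"}
SELECT id, received_at
FROM events
WHERE payload @> '{"type": "invoice.paid"}';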
How do I design a multi-tenant SaaS database schema?
Two patterns: (1) Row-Level Security (RLS): one shared table per entity with a tenant_id column, plus RLS policies that isolate each tenant's rows. Cheaper and simpler to query, but tenants share infrastructure, so a misconfigured policy risks data leaks. (2) Separate schemas: one schema per tenant with the same table structure; complete isolation, higher maintenance. RLS scales to large tenant counts; separate schemas suit smaller fleets that need high-security or regulatory isolation. Index tenant_id together with your common query columns, and remember that table owners bypass RLS unless you FORCE ROW LEVEL SECURITY (see the sketch below).
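A minimal sketch of the RLS pattern; the projects table and the app.current_tenant setting name are assumptions, not a fixed convention:

CREATE TABLE projects (
  id        bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  tenant_id uuid NOT NULL,
  name      text NOT NULL
);

-- Composite index so tenant-scoped lookups stay fast
CREATE INDEX projects_tenant_idx ON projects (tenant_id, id);

ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
ALTER TABLE projects FORCE ROW LEVEL SECURITY;  -- apply policies to the table owner too

-- Each request sets the tenant in a session variable; the policy filters rows
CREATE POLICY tenant_isolation ON projects
  USING (tenant_id = current_setting('app.current_tenant')::uuid);

-- Per request/transaction in the application:
--   SET app.current_tenant = '00000000-0000-0000-0000-000000000001';
--   SELECT * FROM projects;   -- only this tenant's rows are visible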
What's the difference between sharding and partitioning?
Partitioning: split one table by range (e.g., date ranges for logs), list (enum values), or hash within the same database. You get faster queries on specific partitions and easier pruning of old data. Sharding: split data across multiple databases/servers, usually by hash of a key, for horizontal scale with no single-server bottleneck. Pick partitioning to keep large tables manageable on one server (native PostgreSQL partitions); pick sharding when one server can't handle the volume. Partitioning is transparent to queries, while sharding needs app-side routing or a proxy layer (e.g., Citus); see the sketch below.
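A minimal sketch of range partitioning in PostgreSQL, using a hypothetical logs table and example month boundaries:

CREATE TABLE logs (
  logged_at timestamptz NOT NULL,
  level     text,
  message   text
) PARTITION BY RANGE (logged_at);

-- One partition per month; old months can be detached or dropped cheaply
CREATE TABLE logs_2026_01 PARTITION OF logs
  FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE logs_2026_02 PARTITION OF logs
  FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');

-- The planner prunes to logs_2026_01 because of the date filter
SELECT count(*) FROM logs
WHERE logged_at >= '2026-01-15' AND logged_at < '2026-01-16';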
How do I handle zero-downtime migrations on production databases?
Backward-compatible steps: (1) add the new column and backfill it in batches, (2) dual-write to old + new, (3) migrate reads to the new column, (4) remove the old column. Use versioned migration tools: Prisma migrations, Flyway, Liquibase. Test on staging first. For tables over ~1GB, use online schema-change tools (pt-online-schema-change or gh-ost for MySQL, pg_repack for PostgreSQL) to avoid long locks. Checkpoint the backfill (e.g., every 1M rows) so a failed run resumes from the last checkpoint instead of starting over (see the sketch below).
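A minimal sketch of step (1) in PostgreSQL, assuming a hypothetical users table with existing first_name/last_name columns:

-- Adding a nullable column without a default is typically instant (no table rewrite)
ALTER TABLE users ADD COLUMN display_name text;

-- Backfill in small batches so each transaction holds locks only briefly;
-- run this repeatedly (script or job) until it updates 0 rows
UPDATE users
SET display_name = first_name || ' ' || last_name
WHERE id IN (
  SELECT id FROM users
  WHERE display_name IS NULL
  ORDER BY id
  LIMIT 10000
);

-- Record the highest backfilled id as a checkpoint so a failed run resumes there.
-- Only after reads and writes are fully on the new column, drop the old one.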
What indexes should I add to optimize slow queries?
Use EXPLAIN (or EXPLAIN ANALYZE) to find sequential scans and slow joins. Add indexes on: (1) WHERE columns, (2) JOIN keys, (3) ORDER BY / GROUP BY columns. Avoid: wide indexes (>4 columns), indexes on low-cardinality columns, and indexes on UPDATE-heavy columns (they slow writes). Composite indexes: column order matters (equality columns first, then range, then sort). Partial indexes: add a WHERE clause to skip null/archived rows. Trade-off: faster reads, slower writes; monitor disk usage and query time weekly. See the sketch below.
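A minimal sketch against a hypothetical orders table; the column names are assumptions:

-- 1) Inspect the slow query's plan; look for "Seq Scan" on large tables
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE customer_id = 42 AND status = 'shipped'
ORDER BY placed_at DESC
LIMIT 20;

-- 2) Composite index: equality columns first, then the sort column
CREATE INDEX orders_customer_status_placed_idx
  ON orders (customer_id, status, placed_at DESC);

-- 3) Partial index: skip rows you never query (e.g., soft-deleted ones)
CREATE INDEX orders_active_idx
  ON orders (customer_id)
  WHERE deleted_at IS NULL;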
