Skip to main content
JobCannon
All skills

SQL & Databases

Query, model, and optimize data storage systems

β¬’ TIER 2Tech
+$20k-
Salary impact
6 months
Time to learn
Medium
Difficulty
8
Careers
TL;DR

SQL = universal language for querying relational databases. Career path: Junior analyst (SELECT, WHERE, basic JOINs, $60-85k) β†’ Mid-level engineer (window functions, CTEs, indexes, query optimization, $90-130k) β†’ Senior architect (replication, sharding, database design, $130-200k+) over 3-6 months learning time. Required for data analysts, engineers, product managers, and anyone touching data. 80%+ of companies use SQL databases.

What is SQL & Databases

SQL (Structured Query Language) is the universal language for working with relational databases. Despite being over 50 years old, SQL remains essential in 2026β€”80%+ of companies use SQL databases, and SQL skills add $20k-$50k to salaries across tech, data, and business roles. - Data is everywhere: Every app needs to store data

πŸ”§ TOOLS & ECOSYSTEM
PostgreSQLMySQLSQL ServerSQLiteBigQuerySnowflakeRedshiftJOINswindow functionsCommon Table Expressions (CTEs)indexes

πŸ’° Salary by region

RegionJuniorMidSenior
USA$65k$110k$160k
UKΒ£45kΒ£75kΒ£110k
EU€50k€80k€120k
CANADAC$70kC$115kC$170k

βš– Compare with

❓ FAQ

What's the difference between JOINs (INNER, LEFT, RIGHT) and when do I use each?
INNER JOIN returns rows where condition matches in both tables (intersection). LEFT JOIN returns all rows from left table + matching rows from right table (nulls if no match). RIGHT JOIN is the reverse. Use INNER when you only want matched records (e.g., users with orders). Use LEFT when you want all from one table and optional data from another (e.g., all users, even without orders). Avoid RIGHT JOIN β€” rewrite as LEFT JOIN with swapped tables for clarity. Example: SELECT users.name, orders.total FROM users LEFT JOIN orders ON users.id = orders.user_id returns all users and their orders (or null if no orders).
How do window functions (ROW_NUMBER, RANK, LAG) unlock complex analytics?
Window functions perform calculations over a set of rows (partition) without collapsing them like GROUP BY does. ROW_NUMBER assigns unique numbers per partition (useful for deduplication or ranking). RANK allows ties (12, 12, 14 if two rows tied). LAG/LEAD access previous/next row values (useful for comparisons like month-over-month growth). Example: SUM(revenue) OVER (PARTITION BY user_id ORDER BY date) calculates cumulative revenue per user without grouping. These unlock analytics like cohort retention, funnel steps per user, running totals. Learn these to 10x your SQL impact.
What are indexes and why do they matter for query performance?
Index = sorted copy of a column (B-tree structure) that speeds up lookups but slows writes. Without an index, queries scan every row (slow on 100M row tables). With an index, lookup is logarithmic (~30 disk reads for 1B rows vs 1B scans). Strategy: (1) Use EXPLAIN ANALYZE to see how many rows a query scans. (2) If scans >> returned rows, you're missing an index. (3) Index your WHERE conditions first (most common filters). (4) Compound indexes match query shape (order matters). (5) Avoid over-indexing (each slows inserts). Example: CREATE INDEX idx_users_email ON users(email) speeds up WHERE email = '[email protected]' dramatically. Monitor index bloat and rebuild periodically.
SQL vs NoSQL β€” when do I choose each?
SQL (PostgreSQL, MySQL): structured data, ACID transactions (all-or-nothing writes), complex queries with JOINs, data integrity via foreign keys. Use for banking, e-commerce, CRM, anything relational. NoSQL (MongoDB, Redis): flexible schemas, horizontal scaling, eventual consistency, nested data. Use for logs, caches, documents, IoT sensors. Hybrid approach: PostgreSQL for transactional core (users, orders, payments), MongoDB for logs/profiles, Redis for sessions. Choose SQL by default unless you have a specific reason (scale beyond single server, unstructured data, high write throughput).
What are CTEs (Common Table Expressions) and how do they improve query readability?
CTE = temporary named result set (WITH clause) that makes complex queries readable. Instead of nesting subqueries (hard to parse), you build step-by-step. Example: WITH monthly_revenue AS (SELECT DATE_TRUNC('month', date) month, SUM(amount) revenue FROM sales GROUP BY month) SELECT * FROM monthly_revenue WHERE revenue > 100000. CTEs are also useful for recursion (hierarchical data like org charts). Benefits: (1) Reads top-to-bottom like pseudocode. (2) Easier debugging (test each CTE independently). (3) Reuse the CTE multiple times in the same query. Learn CTEs to write queries others can understand.
What's the difference between normalization and denormalization, and when do I denormalize?
Normalization = breaking data into separate tables to eliminate redundancy (3NF: no transitive dependencies). Example: store city as foreign key, not in every user row. Prevents update anomalies (change city name once, not 1M times). Denormalization = storing redundant data (e.g., user name in orders table) to speed up queries by avoiding JOINs. Trade-off: Normalized = slower reads (more JOINs), faster writes. Denormalized = faster reads, slower writes, data consistency issues. Best practice: Normalize your operational database (production), denormalize for analytics (data warehouse, BI tools). Most companies use both: PostgreSQL for transactional OLTP, Snowflake/BigQuery for analytical OLAP (denormalized fact tables).

Not sure this skill is for you?

Take a 10-min Career Match β€” we'll suggest the right tracks.

Find my best-fit skills β†’

Find your ideal career path

Skill-based matching across 2,536 careers. Free, ~10 minutes.

Take Career Match β€” free β†’