▶ Machine Learning vs AI vs Deep Learning — what's the difference?
AI is the broad field of building intelligent machines. Machine Learning is the subset where systems improve from data without explicit programming. Deep Learning is the subset of ML that uses neural networks with many layers. All Deep Learning is ML, but not all ML is Deep Learning. In 2026, ~70% of job postings use 'AI' and 'ML' interchangeably, but deep learning (LLMs, transformers) commands the top salaries. When a posting says 'AI role', assume an LLM/DL focus.
▶ When does traditional ML (random forests, XGBoost) beat deep learning?
Classical ML wins when: (1) you have < 10k samples (DL needs far more), (2) features are engineered by domain experts (gradient-boosted trees thrive on hand-crafted tabular features), (3) interpretability matters (finance, medicine), (4) latency budgets are tight and you can't run transformers on edge devices, (5) the relationship is simple or near-linear. Rough ratio in practice: 80% of production ML is XGBoost/LightGBM, 20% is deep learning. DL gets the hype; classical ML gets the paychecks.
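A minimal sketch of the small-tabular-data case above, using sklearn's GradientBoostingClassifier as a stand-in for XGBoost (the dataset and model choice here are illustrative assumptions, not a benchmark):

```python
# Sketch: on a small tabular dataset (~570 samples), a gradient-boosted
# tree model trains in seconds on CPU and needs no GPU.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)  # 569 samples, 30 features
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)
acc = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {acc:.3f}")
```

With XGBoost proper, `xgboost.XGBClassifier` is a near drop-in replacement for the class above.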
▶ Supervised vs unsupervised vs reinforcement learning — how do I know which to use?
Supervised: you have labeled data (X → Y), goal is prediction (regression/classification). Example: predict house price from features. 60% of real jobs. Unsupervised: no labels, goal is finding structure (clustering, dimensionality reduction). Example: segment customers by behavior. 20% of jobs. Reinforcement learning: agent learns by trial and error, guided by a reward signal. Example: game AI, robotics. 10% of jobs (mostly research/gaming). Rule: always start supervised if you have labels; unsupervised for discovery; RL for interactive systems.
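To make the supervised/unsupervised split concrete, here is the same feature matrix used both ways (synthetic data; the two-blob setup is an illustrative assumption):

```python
# Sketch: supervised learning fits X -> y using labels; unsupervised
# clustering finds structure in X without ever seeing y.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Two well-separated groups: one near (0,0), one near (5,5).
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

clf = LogisticRegression().fit(X, y)        # supervised: needs labels y
supervised_acc = clf.score(X, y)

clusters = KMeans(n_clusters=2, n_init=10,  # unsupervised: only sees X
                  random_state=0).fit_predict(X)
print("supervised accuracy:", supervised_acc)
```

Note the clustering recovers the two groups without labels, but you must interpret what the clusters mean; the supervised model gives you predictions directly.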
▶ How do I pick evaluation metrics — and why isn't accuracy always right?
Accuracy is a trap. Classify by use case: (1) balanced classes (accuracy OK), (2) imbalanced (use precision/recall/F1 or AUC-ROC), (3) ranking (NDCG, MRR), (4) recommendation (hit rate, RMSE), (5) NLP (BLEU, ROUGE). For binary classification: if false positives are expensive (e.g. flagging legitimate transactions as fraud), optimize precision; if false negatives are expensive (e.g. missing a disease in medical diagnosis), optimize recall. ROC-AUC is a safe default when costs are unclear. Most failures come from the wrong metric, not the wrong model.
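Why "accuracy is a trap" on imbalanced data, in five lines (the 5% positive rate is an illustrative assumption):

```python
# Sketch: a degenerate model that predicts "negative" for everything
# scores 95% accuracy on a 5%-positive dataset, yet catches nothing.
from sklearn.metrics import accuracy_score, recall_score

y_true = [1] * 5 + [0] * 95   # 5% positive class (e.g. fraud cases)
y_pred = [0] * 100            # "always negative" classifier

acc = accuracy_score(y_true, y_pred)
rec = recall_score(y_true, y_pred, zero_division=0)
print(f"accuracy={acc:.2f}, recall={rec:.2f}")  # high accuracy, zero recall
```

Recall immediately exposes the failure that accuracy hides, which is the whole argument for metric (2) above.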
▶ How do I prevent overfitting — my model works on training data but fails in production?
Overfitting = memorizing the training data and failing on new data. Fixes ranked by impact: (1) more data (gather 2-10x more), (2) regularization (L1/L2 penalties, dropout, early stopping), (3) a simpler model (smaller network, fewer features), (4) proper validation (cross-validate; never tune on the test set). For deep learning: early-stop on validation loss, apply dropout (typically 10-50% of neurons), use batch normalization. For classical ML: tune the regularization strength (e.g. C, where smaller C means stronger regularization in sklearn) via grid search. More data beats model tuning almost every time.
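A minimal sketch of fix (4) combined with tuning C, using sklearn's GridSearchCV (the synthetic dataset and candidate grid are illustrative assumptions):

```python
# Sketch: cross-validated grid search over regularization strength C.
# CV tunes on held-out folds, so the real test set stays untouched.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, n_features=20,
                           n_informative=5, random_state=0)

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},  # smaller C = stronger L2 penalty
    cv=5,                                       # 5-fold cross-validation
)
search.fit(X, y)
print("best C:", search.best_params_["C"],
      "mean CV accuracy:", round(search.best_score_, 3))
```

The selected C is whatever generalizes best across folds; evaluating it on a final untouched test split is what keeps rule (4) honest.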
▶ GPU vs TPU — when do I need specialized hardware and how much does it cost?
CPU is fine for training on < 1M samples. A GPU (NVIDIA A100/H100) is 10-100x faster for matrix ops; worth it once training takes > 1 hour. TPUs are Google's custom silicon, roughly 2-5x faster than GPUs but locked to TensorFlow/JAX and harder to get. Pricing ballpark: GPU $1-3/hour in the cloud, TPU $4-8/hour. Rule: start on free CPU/Colab, move to a GPU when training exceeds an hour, and leave TPUs to large research labs. Most jobs run on V100/A100-class hardware, not the cutting edge.
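A back-of-envelope cost comparison using the ballpark rates above; the specific speedup factors and the 20-hour job are illustrative assumptions, not measurements:

```python
# Sketch: faster hardware costs more per hour but finishes sooner,
# so compare total job cost, not hourly rate.
def job_cost(cpu_hours, speedup, rate_per_hour):
    """Total cost of a job that would take cpu_hours on CPU,
    run on hardware `speedup` times faster at `rate_per_hour`."""
    return cpu_hours / speedup * rate_per_hour

cpu_hours = 20  # hypothetical training run on CPU

gpu_cost = job_cost(cpu_hours, speedup=30, rate_per_hour=2.0)  # mid-range GPU
tpu_cost = job_cost(cpu_hours, speedup=60, rate_per_hour=6.0)  # ~2x GPU speed
print(f"GPU: ${gpu_cost:.2f}, TPU: ${tpu_cost:.2f}")
```

Under these assumed numbers the GPU run is cheaper despite being slower, which is why the rule of thumb reserves TPUs for workloads that actually saturate them.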
▶ How do I get my first ML role without a PhD?
ML engineers don't need PhDs; roughly 70% hold only undergraduate degrees. Path: (1) learn Python + statistics fundamentals (2-3 months with free resources), (2) build 3-5 portfolio projects on real datasets (Kaggle counts) and deploy at least one model to production, (3) contribute to open-source ML libraries or write blog posts, (4) apply to 'ML Engineer L1' roles focused on production, not research. Talk about deployed models, not papers. Companies hire for shipping, not publishing. Data science skews toward grad-school hires; ML engineering is open to undergrads.