▶ Transformers vs RNNs/LSTMs – why did transformers win?
RNNs (LSTM, GRU) process sequences one token at a time, so training can't be parallelized across the sequence, and they struggle to retain long-range context. Transformers (BERT, GPT) process all tokens in parallel via self-attention, capture long-range dependencies directly, and train 10-100x faster. "Attention Is All You Need" (Vaswani et al., 2017) introduced the architecture; by 2020, transformers were the industry standard. Learning RNNs = learning history; building production NLP = transformers only.
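The "all tokens in parallel" claim is visible in code: self-attention is a handful of matrix multiplications over the whole sequence at once, with no loop over time steps. A minimal NumPy sketch (random weights, not a trained model):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a whole sequence at once.

    X: (seq_len, d_model) token embeddings. Every row is processed in
    parallel by the matrix multiplications below -- no recurrence.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv            # project all tokens at once
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)             # all-pairs token affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                          # context-mixed representations

rng = np.random.default_rng(0)
seq_len, d_model = 6, 8
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (6, 8): one context vector per token
```

Contrast with an RNN, which would need a sequential loop of six dependent steps to produce the same six output rows.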
▶ BERT vs GPT – which should I learn first?
BERT = bidirectional, pretrained on masked language modeling, best for classification/understanding tasks (sentiment, entity extraction, Q&A). GPT = unidirectional (left-to-right), pretrained on causal language modeling, best for generation (chatbots, summarization, translation). Learn BERT first (conceptually simpler, https://huggingface.co/course/chapter1), then GPT. In 2026: fine-tune BERT for custom classifiers, use GPT for chat/generation via API.
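The bidirectional vs unidirectional distinction comes down to the attention mask: BERT lets every token attend to every position, while GPT masks out future positions so generation can proceed left to right. A toy NumPy illustration (masks only, not library internals):

```python
import numpy as np

seq_len = 4

# BERT-style (bidirectional): every token may attend to every position,
# so masked-LM pretraining can use both left and right context.
bert_mask = np.ones((seq_len, seq_len), dtype=int)

# GPT-style (causal): token i may only attend to positions <= i,
# which is what makes left-to-right generation possible.
gpt_mask = np.tril(np.ones((seq_len, seq_len), dtype=int))

print(bert_mask)
print(gpt_mask)
```

Row i of each mask says which positions token i may look at; the strictly zero upper triangle in the GPT mask is the entire difference.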
▶ Fine-tuning vs RAG (Retrieval-Augmented Generation) vs prompt engineering – when to use each?
Prompt engineering = free, fast, no training (try first). RAG = retrieve relevant documents from a vector database, feed to LLM context, good for knowledge-intensive tasks (Q&A over company docs), no retraining. Fine-tuning = expensive (GPUs), slow (hours-days), but fits the model to your data/style. Pick: (1) try prompt engineering, (2) if context window insufficient, add RAG, (3) if LLM still fails, fine-tune. 80% of use cases = prompt engineering + RAG.
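The RAG step can be sketched in a few lines: embed the documents, retrieve the top-k most similar to the query, and stuff them into the prompt. Here a toy bag-of-words embedding stands in for a real embedding model (e.g. sentence-transformers); the structure, not the embedding, is the point:

```python
import numpy as np

def tokenize(text):
    return [w.strip(".,?!") for w in text.lower().split()]

docs = [
    "Refunds are processed within 5 business days.",
    "Our office is closed on public holidays.",
    "Support is available 24/7 via chat.",
]
vocab = sorted({w for d in docs for w in tokenize(d)})

def embed(text):
    """Toy bag-of-words embedding over the corpus vocabulary.
    A real pipeline would call an embedding model here."""
    toks = tokenize(text)
    v = np.array([float(toks.count(w)) for w in vocab])
    n = np.linalg.norm(v)
    return v / n if n else v

doc_vecs = np.stack([embed(d) for d in docs])

def retrieve(query, k=1):
    sims = doc_vecs @ embed(query)  # cosine similarity: vectors are unit-norm
    return [docs[i] for i in np.argsort(sims)[::-1][:k]]

question = "How long do refunds take?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```

Swapping `embed` for a real encoder and `docs`/`doc_vecs` for a vector database gives the production shape; the LLM then answers from `prompt` with no retraining.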
▶ How do I host/deploy an LLM myself vs using an API?
API (OpenAI, Anthropic) = $0.01-0.1 per 1k tokens, no infra to manage. Self-host small LLMs (Llama 7B, Mistral) = $0.50-5/hour GPU, latency 100-500ms, full control, privacy. Rule: API for prototyping and scaling (chat, content generation), self-host for privacy-critical apps (healthcare, finance) or if volume > 10M tokens/month. 2026 trend: smaller specialized models (Mistral AI) on your infrastructure, not giant models via API.
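The 10M tokens/month rule is just crossover arithmetic. A sketch using illustrative numbers from the ranges above (top of the per-token range, a $1.50/hour always-on GPU; real vendor pricing varies):

```python
def monthly_api_cost(tokens_per_month, price_per_1k=0.10):
    """Pay-per-token API cost (illustrative price, not current vendor pricing)."""
    return tokens_per_month / 1000 * price_per_1k

def monthly_selfhost_cost(gpu_price_per_hour=1.50, hours=730):
    """Always-on single-GPU cost for a 730-hour month."""
    return gpu_price_per_hour * hours

for volume in (1e6, 10e6, 100e6):
    api = monthly_api_cost(volume)
    hosted = monthly_selfhost_cost()
    better = "API" if api < hosted else "self-host"
    print(f"{volume/1e6:>5.0f}M tokens/month: "
          f"API ${api:,.0f} vs self-host ${hosted:,.0f} -> {better}")
```

With these assumed prices the break-even lands around 11M tokens/month, consistent with the >10M rule of thumb; rerun with your actual quotes before deciding.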
▶ Vector databases and embeddings – Pinecone vs Weaviate vs building DIY?
Embeddings = convert text to numbers (typically 384-1536 dims), enable semantic search. Pinecone = managed (easiest, $0.10-1/month), Weaviate = self-host (free, complex), DIY = index with NumPy (only for <10k docs). For production: Pinecone if budget available, Weaviate if on-premise required, DIY only for prototypes. A common default stack: sentence-transformers (all-MiniLM-L6-v2, 384-dim) for encoding and cosine similarity for retrieval.
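The DIY tier really is ~20 lines of NumPy. A brute-force cosine-similarity index sketch (random vectors stand in for real embeddings; in practice they would come from a sentence-transformers encoder):

```python
import numpy as np

class DIYIndex:
    """Brute-force cosine-similarity index; fine below ~10k docs."""

    def __init__(self, dim):
        self.dim = dim
        self.vecs = np.empty((0, dim))
        self.ids = []

    def add(self, doc_id, vec):
        vec = np.asarray(vec, dtype=float)
        self.vecs = np.vstack([self.vecs, vec / np.linalg.norm(vec)])
        self.ids.append(doc_id)            # store unit vectors + ids

    def query(self, vec, k=3):
        vec = np.asarray(vec, dtype=float)
        vec = vec / np.linalg.norm(vec)
        sims = self.vecs @ vec             # dot product == cosine (unit vecs)
        top = np.argsort(sims)[::-1][:k]
        return [(self.ids[i], float(sims[i])) for i in top]

rng = np.random.default_rng(42)
index = DIYIndex(dim=384)                  # all-MiniLM-L6-v2 outputs 384 dims
for i in range(100):
    index.add(f"doc-{i}", rng.normal(size=384))
query_vec = rng.normal(size=384)
hits = index.query(query_vec, k=3)
```

Past a few tens of thousands of documents, the O(n) scan per query is what Pinecone/Weaviate replace with approximate nearest-neighbor indexes.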
▶ Multilingual NLP – how hard is it to support multiple languages?
Monolingual models (English BERT) fail on other languages. Solutions: (1) multilingual BERT (mBERT, XLM-RoBERTa) for 100+ langs, lower quality, (2) language-specific models (French BERT, German BERT) for top langs, better quality, (3) translate to English (lossy but works). Recommendation: mBERT for MVP, switch to a language-specific model for each supported language in production. Don't try to build a language-universal model from scratch; use existing multilingual checkpoints from Hugging Face.
▶ How do I evaluate NLP models – metrics beyond accuracy?
Classification: precision/recall/F1 (for imbalanced data), ROC-AUC (ranking quality). Generation (summarization, translation): ROUGE (recall-oriented n-gram overlap), BLEU (n-gram precision), human evaluation (expensive). Token classification (NER): micro/macro F1 (per-token). Semantic similarity: cosine similarity, correlation with human judgments. NEVER use accuracy alone for text tasks – almost all are imbalanced. For LLMs: use LLM-as-judge (ask GPT to score quality) for generation tasks.
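The "never accuracy alone" warning in one toy computation: on a 90% negative dataset, predicting all-negative scores 90% accuracy yet F1 = 0. Pure Python, binary labels:

```python
def precision_recall_f1(y_true, y_pred):
    """Binary precision/recall/F1 from scratch (positive class = 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Imbalanced example: 10 positives, 90 negatives. The all-negative
# "model" looks great by accuracy and useless by F1.
y_true = [1] * 10 + [0] * 90
y_pred = [0] * 100
acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
p, r, f1 = precision_recall_f1(y_true, y_pred)
print(acc, p, r, f1)  # 0.9 0.0 0.0 0.0
```

In practice you would call a library (e.g. scikit-learn) rather than hand-roll this, but the arithmetic above is all those functions do for the binary case.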