JobCannon

LLM Fine-Tuning

Customizing large language models for domain-specific applications

⬢ TIER 3 · Tech
Salary impact: +$40k
Time to learn: 8 months
Difficulty: Hard
Careers: 12
AT A GLANCE

LLM fine-tuning adapts foundation models (GPT, Llama, Mistral, Claude) to domain-specific tasks with parameter-efficient methods (LoRA/QLoRA) or full training. Career path: Practitioner (OpenAI API, basic data prep, $140-170k) → Specialist (LoRA/RLHF, Hugging Face ecosystem, $160-210k) → Expert (distributed training, custom objectives, $200-280k) over 6-9 months. Top-tier salaries: USA $130-280k, UK £75-160k, EU €85-175k. Techniques: LoRA (low-rank adaptation, roughly 10% of the compute of a full tune), QLoRA (quantized, fits on a single GPU), RLHF/DPO (alignment), evaluation frameworks. When to fine-tune: when domain-specific performance must clearly exceed the general model and the budget allows; otherwise RAG or prompt engineering may suffice.

What is LLM Fine-Tuning

LLM fine-tuning adapts pre-trained language models to specific domains, tasks, or styles. Techniques range from full fine-tuning to parameter-efficient methods (LoRA, QLoRA) that require minimal compute. Fine-tuning enables creating specialized models that outperform general-purpose LLMs for specific use cases. Understanding when to fine-tune vs use prompt engineering or RAG, and how to prepare training data, is a critical skill for AI engineers building production AI applications.

🔧 TOOLS & ECOSYSTEM
Hugging Face Transformers, PEFT, TRL, Axolotl, Unsloth, MLflow, Weights & Biases, Modal, Replicate, OpenAI fine-tuning API

💰 Salary by region

Region | Junior | Mid    | Senior
USA    | $140k  | $210k  | $280k
UK     | £75k   | £118k  | £160k
EU     | €85k   | €130k  | €175k
Canada | C$145k | C$218k | C$290k

❓ FAQ

Full fine-tune vs LoRA: which should I use?
LoRA: trains 1-2% of parameters, roughly 90% faster, fits on a single GPU, captures ~90% of the quality gain. Full fine-tune: updates all parameters, slower, needs multi-GPU, slightly higher quality, overfitting risk on small datasets. Start with LoRA for exploration; do a full fine-tune only if you have >100k labeled examples and the compute budget.
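To see why LoRA trains so few parameters, here is a minimal NumPy sketch of the low-rank update W x + (alpha/r)·B(A x); the matrix size and rank below are illustrative assumptions, and a real run would use a library such as Hugging Face PEFT rather than hand-rolled math.

```python
import numpy as np

# Hypothetical sizes: one square projection matrix, LoRA rank 8.
d, r, alpha = 1024, 8, 16

rng = np.random.default_rng(0)
W = rng.normal(size=(d, d))         # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01  # trainable down-projection
B = np.zeros((d, r))                # trainable up-projection, zero-init

def lora_forward(x):
    # Base path plus the scaled low-rank update: W x + (alpha / r) * B (A x)
    return W @ x + (alpha / r) * (B @ (A @ x))

trainable = A.size + B.size  # 2 * r * d parameters
frozen = W.size              # d * d parameters
print(f"trainable fraction: {trainable / frozen:.2%}")  # 1.56%
```

Because B is zero-initialized, the adapted model is identical to the base model at step 0, which is part of why LoRA training is stable out of the box.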
When should I fine-tune vs use RAG vs better prompting?
Prompting: < 1 week, free, baseline. RAG: 1-2 weeks, retrieval-based, no training data needed, good for knowledge updates. Fine-tuning: 2-8 weeks, requires labeled data, permanent model changes, best when general model fundamentally misses your domain (medical, legal, code).
How much training data do I need for LoRA?
LoRA: 500-5000 examples (domain-specific task). Full fine-tune: 5000-100k+. Quality > quantity: 500 excellent examples beat 10k mediocre ones. BLEU, ROUGE, or task-specific metrics guide sufficiency. Start with 10% of your budget to validate data quality.
What's RLHF and DPO, and when do I use them?
RLHF (Reinforcement Learning from Human Feedback): aligns the model to human preferences via a reward model; computationally expensive; the industry standard (ChatGPT). DPO (Direct Preference Optimization): simpler, no reward model, comparable alignment quality, gaining adoption 2024-2026. Use DPO for most cases unless you already have production RLHF infrastructure.
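The DPO objective is simple enough to sketch directly. This toy function (the name and signature are mine, not a library API) computes the loss for one preference pair from total sequence log-probabilities under the policy and the frozen reference model; production code would use something like TRL's DPOTrainer instead.

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Arguments are total log-probabilities of the chosen and rejected
    responses under the policy (pi_*) and the reference model (ref_*).
    """
    # How much more the policy prefers chosen over rejected than the
    # reference does; the loss is -log sigmoid(beta * margin).
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# The loss falls as the policy favors the chosen response more strongly
# than the reference does:
print(dpo_loss(-10.0, -12.0, -11.0, -11.0))  # margin = +2, lower loss
print(dpo_loss(-12.0, -10.0, -11.0, -11.0))  # margin = -2, higher loss
```

No reward model appears anywhere, which is exactly the simplification DPO buys over RLHF.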
How do I evaluate a fine-tuned model?
Automated: BLEU/ROUGE (translation/summarization), METEOR, exact match (QA). Human: blind test of ~100 outputs vs the baseline. Business: latency, cost, throughput. Always evaluate on a held-out set (10-20% of the data) to catch overfitting, and compare model outputs qualitatively.
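A minimal sketch of the held-out-split discipline above, in plain Python with exact match as the metric (the helper names are mine, chosen for illustration):

```python
import random

def split_holdout(examples, frac=0.2, seed=0):
    """Shuffle and split (prompt, answer) pairs into train / held-out sets."""
    ex = list(examples)
    random.Random(seed).shuffle(ex)
    cut = int(len(ex) * (1 - frac))
    return ex[:cut], ex[cut:]

def exact_match(predict, heldout):
    """Fraction of held-out examples the model answers exactly right."""
    hits = sum(predict(prompt) == answer for prompt, answer in heldout)
    return hits / len(heldout)

# Toy data and a toy "model" that maps q<i> -> a<i>.
data = [(f"q{i}", f"a{i}") for i in range(100)]
train, heldout = split_holdout(data)            # 80 / 20 split
score = exact_match(lambda p: "a" + p[1:], heldout)
print(len(train), len(heldout), score)          # 80 20 1.0
```

The same harness extends to BLEU/ROUGE by swapping the equality check for a scoring function from a metrics library.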
What are GPU cost implications for fine-tuning?
LoRA on a single GPU (H100): $2-8/hour. Full fine-tune on multi-GPU: $50-300+/hour. A 1-week project: $2k-10k total. OpenAI API (per 1M tokens): $1.50-$30 depending on model. Calculate ROI: if the fine-tuned model saves 10% on inference costs or increases quality by 20%, it pays for itself within months.
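The payback arithmetic can be made concrete; the dollar figures below are illustrative assumptions, not quotes:

```python
def payback_months(project_cost, monthly_inference_cost, savings_frac):
    """Months until fine-tuning savings cover the one-off project cost."""
    return project_cost / (monthly_inference_cost * savings_frac)

# Illustrative: a $6k LoRA project, $10k/month inference spend, 10% savings.
print(payback_months(6_000, 10_000, 0.10))  # 6.0 months
```

Run the same calculation with your own inference bill before committing to a fine-tune; if payback exceeds the model's expected shelf life, prompting or RAG wins.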
Which base model should I fine-tune β€” 7B, 13B, 70B, or use an API?
7B (Mistral, Llama-2): fastest, cheapest, edge-deployable, good for latency-critical tasks. 13B-70B: better quality, slower. 70B: near-GPT-4 quality but expensive to run. API fine-tuning (OpenAI, Anthropic): zero ops, black-box, most expensive. For startups: start with a 7B LoRA locally, graduate to 13B, and switch to an API only if custom tuning is otherwise impossible.

Not sure this skill is for you?

Take a 10-min Career Match: we'll suggest the right tracks.

Find my best-fit skills →
