Fine-tuning adapts pre-trained large language models to specialized domains or tasks. Instead of using GPT-4 for everything (expensive, generic), you fine-tune a smaller, cheaper model (Llama 2, Mistral, Qwen) on your domain data (medical records, legal documents, customer tickets). The result: domain-specific accuracy at lower cost.

Parameter-Efficient Fine-Tuning (PEFT) methods like LoRA dramatically reduce the cost and data requirements. Instead of updating all 70B parameters of a large model, you train small low-rank adapter matrices added alongside the frozen weights, typically well under 1% of the total parameter count. Fine-tuning then requires thousands of examples instead of millions, fits on consumer GPUs, and runs in hours instead of days.
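To make the low-rank idea concrete, here is a minimal NumPy sketch of a single LoRA-adapted layer (illustrative only, not the `peft` library API; the dimensions and rank are arbitrary choices for the example). The frozen weight `W` stays untouched; only the two small matrices `A` and `B` would be trained.

```python
import numpy as np

# Toy illustration of LoRA's low-rank update (not a real training loop).
# A frozen weight W (d_out x d_in) is adapted by adding B @ A, where
# A (r x d_in) and B (d_out x r) are the only trainable matrices.
d_in, d_out, r = 2048, 2048, 8
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))      # frozen pre-trained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, rank r
B = np.zeros((d_out, r))                    # trainable, zero-initialized,
                                            # so the adapter starts as a no-op

def lora_forward(x, alpha=16):
    # Frozen path plus the scaled low-rank adapter path.
    return W @ x + (alpha / r) * (B @ (A @ x))

full_params = d_out * d_in
trainable_params = A.size + B.size
print(f"trainable fraction: {trainable_params / full_params:.4%}")  # well under 1%
```

Because `B` starts at zero, the adapted layer initially computes exactly the original `W @ x`; training then moves only `A` and `B`, which is why the memory and data requirements drop so sharply.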