AI Engineering 11 min read

The Fine-Tuning Playbook: LoRA, QLoRA, and When to Use Each

A practical decision framework for fine-tuning LLMs — from full parameter training to 4-bit QLoRA on consumer GPUs.

**Prerequisite: Step 5 (Pretraining Data) helps but is not required.** After this post you'll know when fine-tuning is the right choice over prompting, what LoRA and PEFT actually do at a conceptual level, and how to decide whether instruction-tuning your model is worth the cost.

Fine-tuning is the most misused tool in the modern ML stack. Teams fine-tune when they should be prompting, prompt when they should be fine-tuning, and almost always skip the step that matters most: building an eval harness before they start.
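
An eval harness does not need to be elaborate to be useful. The sketch below is one minimal way to build it, not a prescribed design: it scores any text-generation function on exact match and on whether the output parses as JSON. The `evaluate` function, the `prompt`/`expected` field names, and the choice of metrics are all assumptions made for illustration; swap in whatever metrics match your task.

```python
import json
from typing import Callable, Dict, List


def evaluate(generate: Callable[[str], str],
             examples: List[Dict[str, str]]) -> Dict[str, float]:
    """Score a generation function on a fixed set of examples.

    `generate` is any callable mapping a prompt to model output, so the
    same harness can score a prompted baseline and a fine-tuned model.
    Field names ("prompt", "expected") are hypothetical.
    """
    if not examples:
        raise ValueError("need at least one eval example")

    exact = 0
    valid_json = 0
    for ex in examples:
        output = generate(ex["prompt"]).strip()
        # Exact match against the reference output.
        if output == ex["expected"].strip():
            exact += 1
        # Format check: does the output parse as JSON at all?
        try:
            json.loads(output)
            valid_json += 1
        except json.JSONDecodeError:
            pass

    n = len(examples)
    return {"exact_match": exact / n, "valid_json": valid_json / n}
```

Run it once against the prompted baseline before any training. That number is what a fine-tuned model has to beat.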

When Fine-Tuning Actually Wins

Fine-tuning beats prompting when the task requires a consistent output format at high volume; the model needs domain vocabulary it wasn't trained on; latency constraints make long system prompts expensive; or you're doing classification or extraction, where a fine-tuned 7B model can outperform GPT-4 at roughly 10% of the cost.
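
The long-system-prompt point is easy to put numbers on. The arithmetic below is a rough sketch: the function names are made up for this example, and the prompt length, request volume, and price in the usage note are placeholder values rather than measurements or real pricing.

```python
def daily_prompt_overhead_tokens(system_prompt_tokens: int,
                                 requests_per_day: int) -> int:
    """Input tokens spent per day just re-sending the system prompt."""
    return system_prompt_tokens * requests_per_day


def daily_prompt_overhead_cost(system_prompt_tokens: int,
                               requests_per_day: int,
                               price_per_million_input_tokens: float) -> float:
    """Daily cost of that overhead at an assumed input-token price."""
    tokens = daily_prompt_overhead_tokens(system_prompt_tokens, requests_per_day)
    return tokens / 1_000_000 * price_per_million_input_tokens
```

With a hypothetical 1,500-token system prompt, 200,000 requests per day, and an assumed price of 1.00 per million input tokens, `daily_prompt_overhead_tokens(1500, 200_000)` comes to 300,000,000 tokens per day, before counting a single token of actual input. A fine-tuned model that has absorbed the format instructions can drop most of that prefix.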

Fine-tuning doesn't add knowledge — it adjusts behaviour. If the base model doesn't know a fact, fine-tuning won't teach it that fact. Use RAG for knowledge, fine-tuning for style and format.

The Method Decision Framework

| Method | When to use | VRAM | Quality ceiling |
| --- | --- | --- | --- |
| Full FT | Significant task distribution shift; large dataset (>50K examples) | High (full model) | Best |
| LoRA | Adapter for a new task; <10K examples; limited GPU budget | Medium (adapter only) | Near-full-FT |
| QLoRA | Consumer GPU; 4-bit base + LoRA adapter; cost-constrained | Low | Slightly below LoRA |
| Prompt tuning | API-only access; very small dataset; fast iteration | Zero | Limited |
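
To see where the VRAM column comes from, it helps to do the arithmetic. The sketch below is a back-of-the-envelope estimate under stated assumptions, not a measurement: it assumes mixed-precision training with Adam at roughly 16 bytes per trainable parameter (16-bit weights and gradients plus 32-bit master weights and two optimizer moments), 2 bytes per frozen 16-bit parameter, and 0.5 bytes per frozen 4-bit parameter. Activations, quantization constants, and framework overhead are ignored, so real usage will be higher. All function names and the example adapter shape are hypothetical.

```python
BYTES_PER_GIB = 1024 ** 3

# Assumed cost of one trainable parameter under mixed-precision Adam:
# 2 (fp16 weight) + 2 (fp16 grad) + 4 (fp32 master) + 4 + 4 (Adam moments).
TRAIN_BYTES_PER_PARAM = 16.0


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one LoRA adapter pair.

    LoRA learns an update B @ A in place of a full (d_out x d_in) matrix,
    where A is (rank x d_in) and B is (d_out x rank).
    """
    return rank * (d_in + d_out)


def estimate_gib(frozen_params: int,
                 frozen_bytes_per_param: float,
                 trainable_params: int) -> float:
    """Rough memory for weights, gradients, and optimizer state only."""
    total_bytes = (frozen_params * frozen_bytes_per_param
                   + trainable_params * TRAIN_BYTES_PER_PARAM)
    return total_bytes / BYTES_PER_GIB


# Hypothetical setup: a 7e9-parameter model, rank-16 adapters on four
# 4096x4096 projections in each of 32 layers.
base_params = 7_000_000_000
adapter_params = lora_trainable_params(4096, 4096, 16) * 4 * 32

full_ft = estimate_gib(0, 0.0, base_params)             # everything trainable
lora = estimate_gib(base_params, 2.0, adapter_params)   # frozen 16-bit base
qlora = estimate_gib(base_params, 0.5, adapter_params)  # frozen 4-bit base
```

Under these assumptions the three estimates come out at roughly 104, 13, and 3.5 GiB. The adapter itself is tiny (about 16.8M parameters here); with LoRA the frozen base weights still dominate memory, which is why quantizing them to 4 bits is what brings QLoRA within reach of a consumer GPU.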

The 5-Step Production Workflow

Dataset Quality Is the Bottleneck

The most common fine-tuning failure is dataset quality, not model choice or hyperparameters. Every low-quality example you include teaches the model to produce low-quality outputs. Filter aggressively: remove duplicates, remove examples where the reference output is itself wrong, and maintain label balance.
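
Two of these filters are mechanical and can be sketched in a few lines. The example below assumes each row is a dict with hypothetical `prompt`, `output`, and `label` fields; it drops rows with empty fields and exact duplicates (after collapsing whitespace and ignoring case), then reports label shares so imbalance is visible. It cannot detect a wrong reference output: that still requires human review or a separate check.

```python
from collections import Counter
from typing import Dict, List


def clean_dataset(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop rows with empty fields and duplicate (prompt, output) pairs."""
    seen = set()
    cleaned = []
    for row in rows:
        # Collapse runs of whitespace so trivial variants compare equal.
        prompt = " ".join(row["prompt"].split())
        output = " ".join(row["output"].split())
        if not prompt or not output:
            continue  # nothing to learn from an empty prompt or reference
        key = (prompt.lower(), output.lower())
        if key in seen:
            continue  # duplicate of a row we already kept
        seen.add(key)
        cleaned.append({**row, "prompt": prompt, "output": output})
    return cleaned


def label_shares(rows: List[Dict[str, str]],
                 label_key: str = "label") -> Dict[str, float]:
    """Fraction of rows per label, for spotting class imbalance."""
    counts = Counter(row[label_key] for row in rows)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}
```

Checking `label_shares` before and after cleaning is worthwhile, since deduplication can shift the class balance on its own.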

Dataset sizing heuristics

The Mistakes That Cost Teams the Most

The teams that fine-tune best treat it as a last resort, not a first one. Exhaust prompting and RAG before reaching for gradient updates.
