16 Comments
Nathan Lambert

Not enough people are talking about this obvious fact that most people shouldn't bother training models (I only skimmed very cursorily, mostly the title). It was the TLDR of my talk on "post training for applications": https://www.youtube.com/watch?v=grpc-Wyy-Zg

Most people should just not do it!

ToxSec

Agreed!

Binh Pham

lol, that comment on the British and the French was cold

Devansh

Thank you

Maria Sukhareva

Correct. When you fine-tune in the sense of changing all the weights, you risk breaking the model.

PEFT types of fine-tuning, like LoRA, are much safer. It's still fine-tuning, but it only trains a small set of weights, and it works great.
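
For anyone curious, a minimal sketch of what LoRA fine-tuning looks like with Hugging Face's peft library (the model name and hyperparameters here are illustrative assumptions, not recommendations):

```python
# Minimal LoRA setup with Hugging Face peft. Model name and hyperparameters
# are illustrative assumptions only.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor for the update
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
# Only the small adapter matrices are trainable; the base weights stay frozen.
model.print_trainable_parameters()
```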

I would not say, though, that RAG is a way to bring new knowledge to LLMs.

Even in a RAG setup, the assembled prompt (user question plus retrieved passages) is still just a query to the model. The model isn't learning new information; it's interpreting the concatenated text through its existing patterns.

If your domain is too far from what the LLM saw in training, RAG won't help; it will spiral into hallucinations.
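
To make that concrete, here is a rough sketch of what a RAG call boils down to; `retriever` and `llm` are hypothetical stand-ins, not any particular library:

```python
# Sketch of why a RAG prompt is "just a query": retrieved passages are simply
# concatenated with the question before a single generation call.
def answer_with_rag(question: str, retriever, llm, k: int = 4) -> str:
    passages = retriever.search(question, top_k=k)   # assumed interface
    context = "\n\n".join(p.text for p in passages)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
    # No weights change here; the model just interprets the concatenated text.
    return llm.generate(prompt)                      # assumed interface
```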

Devansh

True that, re RAG. But as far as I've seen, that's not a use case we've ever dealt with. Where have you seen it happen?

For more specialized domains, we tend to use agentic RAG (plan the information that's needed, try to retrieve it, critique the retrieved information, replan based on that, and loop till the information quality matches the information need). The critique part is a mixture of custom dynamic reward models and traditional search techniques to help with speed.
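
In pseudocode, that kind of loop might look something like the sketch below (all component names are hypothetical placeholders, not a description of the actual implementation):

```python
# Rough sketch of an agentic RAG loop: plan -> retrieve -> critique -> replan.
# planner, retriever, and critic are hypothetical placeholder components.
def agentic_rag(question: str, planner, retriever, critic, max_rounds: int = 3):
    plan = planner.plan(question)                     # what information is needed
    evidence = []
    for _ in range(max_rounds):
        for query in plan.queries:
            evidence.extend(retriever.search(query))  # try to retrieve
        verdict = critic.evaluate(question, evidence) # e.g. reward model + search heuristics
        if verdict.good_enough:                       # quality matches the information need
            break
        plan = planner.replan(question, evidence, verdict.feedback)
    return evidence
```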

So far whatever we've tried this on works fairly well. But we don't work with very specialized domains.

Maria Sukhareva

Those were factory automation manuals; not sure exactly what they're called, but that's what they were about. Internal, restricted data, so the model couldn't have had them in its training data.

Esborogardius Antoniopolus

Isn't everyone doing LoRA nowadays when they say they are fine-tuning?

ToxSec

This was a really good read, thank you. I've built a few RAGs and dabbled with fine-tuning, but never liked it for some reason. The RAG part was great; I thought I just didn't have the skill for the fine-tuning, hah.

Devansh

<3

Andrew Duncan

But isn't anyone who is doing fine-tuning really using LoRA adapters? Who's got the compute to do full fine-tuning of big models? So the capacity argument is less clear.

Vincent

This post is hilarious. People like this author are the ones vetting start-ups? Please. The idea that alignment leads to a degradation in model utility is hardly news.

But let’s be clear: fine-tuning an LLM to specialize in a task isn’t just about minimizing utility loss. It’s about trade-offs. You have to weigh what you gain against what you lose.

Devansh

Didn't deny that, but keep in mind this is written for people who don't know that tuning has massive risks (which is many more people than you'd think).

Sten Rüdiger

Great post! Adapters are still underappreciated. You may be interested in this novel adapter method:

https://stenruediger.substack.com/p/supercharge-your-llms-introducing

Omar Masrur

I think the title is quite misleading. I have hardly seen suggestions of fine-tuning an entire model: most talk is about LoRA. And what about fine-tuning embedding models and re-rankers?

Harrison

Fascinating read! I'm Harrison, an ex-fine-dining line cook. My stack, "The Secret Ingredient," adapts hit restaurant recipes for easy home cooking.

Check us out:

https://thesecretingredient.substack.com
