Not enough people are talking about this obvious fact that most people shouldn't bother training models (I only very cursorily skimmed it, mostly the title). It was the TL;DR of my talk on "post training for applications": https://www.youtube.com/watch?v=grpc-Wyy-Zg
Most people should just not do it!
Agreed!
lol that comment on british and french was cold
Thank you
Correct. When you are fine-tuning in the sense of changing all the weights, you risk breaking the model.
PEFT-style fine-tuning, like LoRA, is much safer: it's still fine-tuning, but it only touches a small subset of weights, and it works great.
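Roughly what that looks like with Hugging Face's peft library, as a minimal sketch (the model name, target modules, and hyperparameters here are illustrative assumptions, not anything from the post):

```python
# Minimal LoRA sketch with the Hugging Face peft library.
# Model name and hyperparameters are illustrative only.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling applied to the update
    target_modules=["q_proj", "v_proj"],  # only these projections get adapters
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% of the base weights
```

The base weights stay frozen; only the small adapter matrices are trained, which is why it's much harder to break the model this way.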
I would not say RAG though is a way to bring new knowledge to LLMs.
Even in a RAG setup, the assembled prompt (user question plus retrieved passages) is still just a query to the model. The model isn't learning new information; it's interpreting the concatenated text through its existing patterns.
If your domain is too far from what an LLM saw during training, RAG won't help; it will spiral into hallucinations.
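To make that concrete, a RAG prompt is usually just string concatenation before it ever reaches the model; a minimal sketch, where retrieve() is a hypothetical stand-in for whatever vector search you use:

```python
# Minimal sketch of RAG prompt assembly. retrieve() is a hypothetical
# stand-in for a real vector-search call; nothing here changes the model.
def build_rag_prompt(question: str, retrieve) -> str:
    passages = retrieve(question, k=3)   # top-k text chunks from your index
    context = "\n\n".join(passages)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

If the model has never seen anything like the retrieved text, it still has to interpret it through whatever patterns it already has.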
True that re RAG. But as far as I've seen, that's not a use case we've ever dealt with. Where have you seen it happen?
For much more specialized domains, we tend to use agentic RAG (plan what information is needed, try to retrieve it, critique the retrieved information, replan based on that, and loop until the information quality matches the information need). The critique part is a mixture of custom dynamic reward models and traditional search techniques to help with speed.
So far whatever we've tried this on works fairly well. But we don't work with very specialized domains.
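Schematically, that loop looks something like the sketch below; every function name here is a hypothetical stand-in, not the actual pipeline described above:

```python
# Illustrative plan -> retrieve -> critique -> replan loop for agentic RAG.
# plan(), retrieve(), critique(), and replan() are hypothetical stand-ins.
def agentic_rag(question, plan, retrieve, critique, replan, max_rounds=5):
    info_need = plan(question)                   # what information do we need?
    evidence = []
    for _ in range(max_rounds):
        evidence += retrieve(info_need)          # try to fetch it
        verdict = critique(question, evidence)   # e.g. reward model + search heuristics
        if verdict.good_enough:                  # quality matches the information need
            break
        info_need = replan(question, evidence, verdict)  # adjust the plan and loop
    return evidence
```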
Those were factory automation manuals; I'm not sure exactly what they're called, but that's what they were about. Internal restricted data, so the model couldn't have had them in its training data.
Isn't everyone doing LoRA nowadays when they say they are fine-tuning?
This was a really good read. Thank you. I've built a few RAGs and dabbled with fine-tuning, but never liked it for some reason. The RAG part was great; I thought I just didn't have the skill for the fine-tuning, hah.
<3
But isn't everyone who is doing fine-tuning really using LoRA adapters? Who's got the compute to do full fine-tuning of big models? So the capacity argument is less clear.
This post is hilarious. People like this author are the ones vetting start-ups? Please. The idea that alignment leads to a degradation in model utility is hardly news.
But let’s be clear: fine-tuning an LLM to specialize in a task isn’t just about minimizing utility loss. It’s about trade-offs. You have to weigh what you gain against what you lose.
Didn't deny that, but keep in mind this is written for people who don't know that tuning has massive risks (which is many more people than you'd think).
Great post! Adapters are still underappreciated. You may be interested in this novel adapter method:
https://stenruediger.substack.com/p/supercharge-your-llms-introducing
I think the title is quite misleading. I have hardly seen suggestions of fine-tuning an entire model: most talk is about LoRA. And what about fine-tuning embedding models and re-rankers?
Fascinating read! I’m Harrison, an ex fine dining line cook. My stack "The Secret Ingredient" adapts hit restaurant recipes for easy home cooking.
check us out:
https://thesecretingredient.substack.com