
Aren't LLMs already trained on the whole web? No need for RAG, in theory.


Training doesn't work like that. Just because a model has been exposed to text in its training data doesn't mean the model will "remember" the details of that text.

Llama 3 was trained on 15 trillion tokens, but I can download a version of that model that's just 4GB in size.
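A rough back-of-envelope check makes the point (the 15 trillion token figure is from Meta's Llama 3 announcement; the 4GB figure refers to a heavily quantized download):

```python
# Back-of-envelope: how much weight storage does the model have
# per training token? If this is far below one byte per token,
# verbatim memorization of the training set is impossible.
tokens_trained = 15e12          # Llama 3 training corpus, tokens
model_bytes = 4 * 1024**3       # ~4GB quantized model download

bytes_per_token = model_bytes / tokens_trained
print(f"{bytes_per_token:.6f} bytes of weights per training token")
```

At well under a thousandth of a byte per training token, the model can only retain compressed statistical patterns, not the text itself.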

No matter how "big" your model is, there is still scope for techniques like RAG if you want it to return answers grounded in actual text, as opposed to often-correct hallucinations spun up from the giant matrices of numbers in the model weights.
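The grounding idea can be sketched in a few lines. This is a minimal illustration, not a real RAG pipeline: retrieval here is naive keyword overlap, where a production system would use embeddings and a vector index; the document strings are made up for the example.

```python
# Minimal RAG sketch: retrieve the most relevant snippet for a query,
# then build a prompt that grounds the model's answer in that snippet.

def retrieve(query: str, docs: list[str]) -> str:
    """Pick the doc with the most words in common with the query.
    (Stand-in for embedding similarity search.)"""
    query_words = set(query.lower().split())
    return max(docs, key=lambda d: len(query_words & set(d.lower().split())))

def build_prompt(query: str, docs: list[str]) -> str:
    """Stuff the retrieved context into the prompt so the model
    answers from the text rather than from its weights alone."""
    context = retrieve(query, docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Llama 3 was trained on 15 trillion tokens.",
    "RAG grounds model answers in retrieved source text.",
]
print(build_prompt("How many tokens was Llama 3 trained on?", docs))
```

The prompt that comes out is what actually gets sent to the LLM; the retrieved passage, not the model's memory, becomes the source of truth for the answer.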


They're only trained on data up to a certain cutoff date, so adding RAG lets such LLMs access more up-to-date information.


GPT-2 was launched in 2019, followed by GPT-3 in 2020 and GPT-4 in 2023. RAG helps bridge the information gap between those long LLM release cycles.



