Common Sense: AI2 OLMoTrace points model output back to training data

Wednesday, April 09, 2025

AI2 OLMoTrace points model output back to training data

Good news! Does this new model also address hallucination? 😊

Always keep in mind: when was the training (or last training update) cutoff date? And how fast do these large language and multimodal models become outdated?

"For years it’s been an open question — how much is a language model learning and synthesizing information, and how much is it just memorizing and reciting?

Introducing OLMoTrace, a new feature in the Ai2 Playground that begins to shed some light.

OLMoTrace connects phrases or even whole sentences in the language model’s output back to verbatim matches in its training data. It does this by searching billions of documents and trillions of tokens in real time and highlighting where it finds compelling matches. ...

Through OLMoTrace, you can gain insights into why the model generates certain sequences of words. ..."
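To make the idea of "verbatim matches" concrete, here is a toy sketch in Python. It is not how OLMoTrace is implemented: the announcement quoted above describes a search over billions of documents in real time, which needs a proper index, while this brute-force loop only works on a handful of short strings. The function name `find_verbatim_spans`, the `min_tokens` threshold, and the sample texts are all made up for illustration.

```python
def find_verbatim_spans(output, documents, min_tokens=3):
    """Greedily find token spans of `output` that occur verbatim in a document.

    Returns a list of (span_text, document_index) pairs. Tokens are
    whitespace-separated words; a real system would use the model's tokenizer.
    """
    out_tokens = output.split()
    doc_tokens = [doc.split() for doc in documents]
    matches = []
    i = 0
    while i < len(out_tokens):
        best_len, best_doc = 0, None
        # Brute force: try every start position in every document.
        for d, toks in enumerate(doc_tokens):
            for j in range(len(toks)):
                k = 0
                while (i + k < len(out_tokens) and j + k < len(toks)
                       and out_tokens[i + k] == toks[j + k]):
                    k += 1
                if k > best_len:
                    best_len, best_doc = k, d
        if best_len >= min_tokens:
            # Keep the longest match starting here, then skip past it.
            matches.append((" ".join(out_tokens[i:i + best_len]), best_doc))
            i += best_len
        else:
            i += 1
    return matches


# Hypothetical "training data" and "model output" for illustration only.
training_docs = [
    "the quick brown fox jumps over the lazy dog",
    "language models can memorize training data",
]
model_output = "I think the quick brown fox can memorize training data today"

for span, doc_index in find_verbatim_spans(model_output, training_docs):
    print(f"'{span}' appears verbatim in document {doc_index}")
```

Short matches are dropped on purpose: a two-word overlap says little about memorization, while a long exact overlap is the kind of "compelling match" worth highlighting.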

OLMoTrace points model output back to training data
