This could be an interesting new paper!
"... Now, a new approach developed by MIT researchers enables LLMs to update themselves in a way that permanently internalizes new information. Just like a student, the LLM generates its own study sheets from a user’s input, which it uses to memorize the information by updating its inner workings.
The model generates multiple self-edits to learn from one input, then applies each one to see which improves its performance the most. This trial-and-error process teaches the model the best way to train itself.
The researchers found this approach improved the accuracy of LLMs at question-answering and pattern-recognition tasks, and it enabled a small model to outperform much larger LLMs. ..."
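The trial-and-error step in that description is easy to picture in code. Below is a minimal sketch in Python; every helper here (generate_self_edits, finetune_copy, evaluate) is a hypothetical placeholder for model-specific machinery, not the authors' actual API.

import copy

def generate_self_edits(model, passage, n):
    """Placeholder: sample n candidate 'study sheets' (restructured facts,
    implications, QA pairs) from the model, conditioned on the passage."""
    raise NotImplementedError

def finetune_copy(model, self_edit):
    """Placeholder: supervised finetuning of a model copy on one self-edit,
    producing a persistent weight update."""
    raise NotImplementedError

def evaluate(model, eval_questions):
    """Placeholder: downstream accuracy on questions about the passage,
    answered without the passage in context."""
    raise NotImplementedError

def select_best_self_edit(model, passage, eval_questions, n_candidates=5):
    """Generate several self-edits for one input, finetune a fresh copy of
    the model on each, and keep whichever copy scores best afterwards."""
    best = (float("-inf"), None, None)  # (score, self-edit, updated model)
    for edit in generate_self_edits(model, passage, n=n_candidates):
        updated = finetune_copy(copy.deepcopy(model), edit)
        score = evaluate(updated, eval_questions)
        if score > best[0]:
            best = (score, edit, updated)
    return best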
From the abstract:
"Large language models (LLMs) are powerful but static; they lack mechanisms to adapt their weights in response to new tasks, knowledge, or examples.
We introduce Self-Adapting LLMs (SEAL), a framework that enables LLMs to self-adapt by generating their own finetuning data and update directives.
Given a new input, the model produces a self-edit: a generation that may restructure the information in different ways, specify optimization hyperparameters, or invoke tools for data augmentation and gradient-based updates.
Through supervised finetuning (SFT), these self-edits result in persistent weight updates, enabling lasting adaptation.
To train the model to produce effective self-edits, we use a reinforcement learning loop with the downstream performance of the updated model as the reward signal. Unlike prior approaches that rely on separate adaptation modules or auxiliary networks, SEAL directly uses the model's own generation to control its adaptation process.
Experiments on knowledge incorporation and few-shot generalization show that SEAL is a promising step toward language models capable of self-directed adaptation. Our website and code are available at this https URL."
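The "reinforcement learning loop" in the abstract can be sketched by wrapping the selection step from the snippet above in an outer training loop. Here the reward is downstream accuracy after the update, and the policy-improvement step shown is plain rejection sampling plus finetuning (keep the highest-reward self-edit per input and clone it); that is one common way to realize such a loop, and the paper's exact RL recipe may differ. finetune_on_pairs is another hypothetical placeholder.

def seal_outer_loop(model, tasks, n_rounds=3):
    """tasks: iterable of (passage, eval_questions) pairs."""
    for _ in range(n_rounds):
        winners = []
        for passage, eval_questions in tasks:
            _score, edit, _updated = select_best_self_edit(model, passage, eval_questions)
            winners.append((passage, edit))  # keep the best-scoring self-edit
        # Reinforce: finetune the base model so that, given each passage, it
        # becomes more likely to generate its highest-reward self-edit.
        model = finetune_on_pairs(model, winners)
    return model

def finetune_on_pairs(model, passage_edit_pairs):
    """Placeholder: supervised finetuning on (passage -> self-edit) pairs."""
    raise NotImplementedError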