Friday, February 02, 2024

AI2: OLMo 7B: Open Language Model. A State-Of-The-Art, Truly Open LLM and Framework

Good news for open AI research! Dive in!

"AI2 opens its framework for training and experimenting with large language models on Hugging Face and GitHub with the launch of our first Open Language Model (OLMo). The AI2 LLM framework is intentionally designed to provide access to data, training code, models, and evaluation code necessary to advance AI through open research to empower academics and researchers to study the science of language models collectively. This approach enables the AI community to access a broader range of research questions, such as understanding the specific impact of certain subsets of pretraining data on downstream performance or investigating new pretraining methods and understanding instabilities.
This effort's first batch of models includes four final variants of our language model at the 7B scale corresponding to different architectures, optimizers, and training hardware, and one model at the 1B scale, all trained on at least 2T tokens. This is the first step in a long series of planned releases, continuing with larger models, instruction-tuned models, and more variants down the line.

Each model comes with the following:
  • Full training data used for these models, including the code that produces it, from AI2’s Dolma, along with WIMBD for analyzing the pretraining data.
  • Full model weights, training code, training logs, training metrics in the form of Weights & Biases logs, and inference code.
  • 500+ checkpoints per model, from every 1000 steps during the training process, available as revisions on Hugging Face [see the loading sketch after the quote].
  • Evaluation code under the umbrella of AI2’s Catwalk and Paloma.
  • Fine-tuning code and adapted models (coming soon with Open Instruct).
  • All code, weights, and intermediate checkpoints are released under the Apache 2.0 License.
 ..."

OLMo: Open Language Model. A State-Of-The-Art, Truly Open LLM and… | by AI2 | Feb, 2024 | AI2 Blog
