Monday, March 11, 2024

On Covariant - Introducing RFM-1: Giving robots human-like reasoning capabilities

This seems to be a hot item in machine learning & AI! Covariant was founded by former OpenAI researchers. Hype or real?

"... RFM-1 — a Robotics Foundation Model trained on both general internet data as well as data that is rich in physical real-world interactions — represents a remarkable leap forward toward building generalized AI models that can accurately simulate and operate in the demanding conditions of the physical world. ...
Starting with a wide variety of operations in the warehouse automation space, our new generation of AI (RFM-1) showcases the power of Robotics Foundation Models. Our approach of combining the largest real-world robot production dataset, along with a massive collection of internet data, is unlocking new levels of accuracy and productivity in warehouse applications, and shows a clear path to expanding to other robotic form factors as well as broader industry applications. ...
Set up as a multimodal any-to-any sequence model, RFM-1 is an 8 billion parameter transformer trained on text, images, videos, robot actions, and a range of numerical sensor readings.
By tokenizing all modalities into a common space and performing autoregressive next-token prediction, RFM-1 uses its broad range of input and output modalities to enable diverse applications. ...
RFM-1’s understanding of physics emerges from learning to generate videos: with input tokens of an initial image and robot actions, it acts as a physics world model to predict future video tokens. ...
Language-guided robot programming
RFM-1 allows robot operators and engineers to instruct robots to perform specific picking actions using plain English. ..."
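
To make the "tokenizing all modalities into a common space" and "autoregressive next-token prediction" parts of the quote a bit more concrete, here is a toy sketch. It is not Covariant's implementation: the vocabulary sizes, the offset scheme, and every function name below are my own assumptions for illustration. Each modality gets its own slice of one shared token ID range, the tokens are concatenated into a single sequence, and training pairs are formed by predicting each token from the ones before it.

```python
# Toy sketch only: vocabulary sizes and names are hypothetical, not from the RFM-1 post.
MODALITY_VOCAB = {"text": 1000, "image": 512, "action": 256}


def build_offsets(vocab_sizes):
    """Assign each modality a contiguous slice of one shared token ID range."""
    offsets = {}
    start = 0
    for name, size in vocab_sizes.items():
        offsets[name] = start
        start += size
    return offsets, start


OFFSETS, TOTAL_VOCAB = build_offsets(MODALITY_VOCAB)


def to_shared(modality, local_ids):
    """Map modality-local token IDs into the shared ID space."""
    size = MODALITY_VOCAB[modality]
    base = OFFSETS[modality]
    for i in local_ids:
        if not 0 <= i < size:
            raise ValueError(f"token {i} out of range for modality {modality!r}")
    return [base + i for i in local_ids]


def from_shared(token):
    """Recover (modality, local_id) from a shared token ID."""
    for name, base in OFFSETS.items():
        if base <= token < base + MODALITY_VOCAB[name]:
            return name, token - base
    raise ValueError(f"token {token} is outside the shared vocabulary")


def next_token_pairs(sequence):
    """Build (context, target) pairs as used in autoregressive training."""
    return [(sequence[:i], sequence[i]) for i in range(1, len(sequence))]


# An instruction, an image patch, and a robot action become one token sequence.
sequence = (
    to_shared("text", [5, 7])
    + to_shared("image", [3])
    + to_shared("action", [10])
)
pairs = next_token_pairs(sequence)
```

Because every modality lives in the same ID space, a single model can, in principle, take any mix of modalities as context and emit any modality as output, which is what "any-to-any" refers to. A real system would use learned tokenizers (for example for images and video) rather than the hand-picked IDs used here.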

Introducing RFM-1: Giving robots human-like reasoning capabilities (among the authors is Pieter Abbeel)
