Learn With A Robot

Making robots perform more generalized tasks

We explore Robotics Transformer 2 (RT-2) from Google DeepMind

Amitabha Banerjee
Aug 24, 2023

Introduction

Getting robots to perform general tasks is one of the hardest barriers to cross before they can take on common activities in our daily lives. In this forum, we have explored how generative AI can be used to build datasets that can then be used to train robots to perform generic tasks (if you are interested, please check our overviews of GenAug and ROSIE).

In this article, we discuss a different approach to the same goal, based on Vision-Language-Action models (VLAs). Google DeepMind recently released their work on Robotics Transformer 2 (RT-2), which received quite a bit of attention in the press. We will dive into the details of RT-2 in this article.

RT-1

Before we delve into RT-2, let us go through the basics of RT-1, the first version of the Robotics Transformer from the Google DeepMind team. RT-1 had the same goal as RT-2: to train a robot to perform general-purpose tasks. To train RT-1, researchers at Google DeepMind developed a real-world robotics dataset of 130…
