{"id":1844,"date":"2023-08-20T11:38:19","date_gmt":"2023-08-20T11:38:19","guid":{"rendered":"https:\/\/mlnews.dev\/?p=1844"},"modified":"2023-09-21T12:01:36","modified_gmt":"2023-09-21T12:01:36","slug":"visual-language-action-models","status":"publish","type":"post","link":"https:\/\/mlnews.dev\/visual-language-action-models\/","title":{"rendered":"RT-2 Visual-Language-Action Models: Empowering Robots with Power of Web Knowledge"},"content":{"rendered":"\n

A new way to teach robots<\/a>: RT-2 Visual-Language-Action Models use large vision-and-language models trained on web data to control robots directly, helping them follow instructions and understand their surroundings far better than before.<\/p>\n\n\n\n

Google DeepMind built RT-2, a single model that combines knowledge of images and text from the internet with data on how robots move, and used it to teach robots to perform tasks better.<\/p>\n\n\n\n

Visual-Language-Action Models such as RT-2-PaLI-X and RT-2-PaLM-E output robot actions in the same token format as text, so one model can both understand a scene and act in it. This lets them follow novel instructions and complete tasks that require reasoning about objects and the relationships between them.<\/p>\n\n\n
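To make the actions-as-text idea concrete: RT-2-style models discretize each continuous action dimension into a fixed number of bins and emit the bin indices as tokens the language model can produce. The sketch below is a hedged toy version, not the released code; the 256-bin count follows the RT papers, but the normalized value range and the function names are assumptions for illustration:

```python
import numpy as np

# Illustrative sketch of action-as-text serialization for a VLA model.
# Assumptions (not from the RT-2 release): actions are pre-normalized to
# [-1, 1], and 256 bins per dimension as in the RT papers.
NUM_BINS = 256
LOW, HIGH = -1.0, 1.0

def action_to_tokens(action):
    """Discretize a continuous action vector into integer bins and
    serialize it as a space-separated token string."""
    clipped = np.clip(action, LOW, HIGH)
    bins = np.round((clipped - LOW) / (HIGH - LOW) * (NUM_BINS - 1)).astype(int)
    return " ".join(str(b) for b in bins)

def tokens_to_action(text):
    """Invert the serialization: parse token string back into floats."""
    bins = np.array([int(t) for t in text.split()], dtype=float)
    return bins / (NUM_BINS - 1) * (HIGH - LOW) + LOW
```

The round trip loses at most half a bin width per dimension, which is the price of letting the policy share one output vocabulary with plain text.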

\n
\"robot<\/figure><\/div>\n\n\n

New Robot Skills Unlocked<\/strong><\/h2>\n\n\n\n

Past methods could only learn skills demonstrated in robotic datasets. Generalization was limited and reasoning abilities were minimal without huge amounts of robotic experience.<\/p>\n\n\n\n

The RT-2 policies exhibit dramatically improved generalization, up to 6X over baselines, along with emergent reasoning abilities, such as placing objects according to symbols and semantic relationships, gained purely from web-scale pretraining.<\/p>\n\n\n\n

The future looks bright: this technique could soon enable more capable real-world robots without requiring impractically large robotic datasets.<\/p>\n\n\n

\n
\"RT2<\/figure><\/div>\n\n\n

Get Hands-On with Visual-Language-Action Models<\/strong><\/h2>\n\n\n\n

The project website<\/a> hosts the resources released with the work: the code used and videos showing the robot in action. The vision-and-language models underlying RT-2 are the team's own, and not every detail about them is disclosed, but instructions and code are provided to help you build something similar.<\/p>\n\n\n\n

If you want to reproduce the approach, the project website is the place to start: it offers clear explanations, step-by-step guides, and code snippets for learning and experimenting with similar systems on your own.<\/p>\n\n\n\n
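For a sense of what such experimentation looks like, here is a minimal, hypothetical sketch of the closed-loop control pattern a VLA policy enables: at each step the model maps an instruction plus a camera image to an action token string, which is decoded and sent to the robot. `stub_model`, `decode_action`, and `control_loop` are illustrative stand-ins, not APIs from the RT-2 release:

```python
from typing import Callable, List

NUM_BINS = 256  # assumed discretization, matching RT-style binning

def decode_action(token_str: str) -> List[float]:
    """Map a space-separated token string back to actions in [-1, 1]."""
    return [int(t) / (NUM_BINS - 1) * 2.0 - 1.0 for t in token_str.split()]

def control_loop(model: Callable, get_image: Callable, instruction: str,
                 steps: int) -> List[List[float]]:
    """Skeleton of closed-loop VLA control: each step, the model maps
    (instruction, image) to an action string, which is decoded."""
    actions = []
    for _ in range(steps):
        image = get_image()  # in practice: a camera frame
        actions.append(decode_action(model(instruction, image)))
    return actions

# Hypothetical stand-in for a real VLA policy: always returns 7 mid-range
# action tokens plus a final "0" (e.g. a gripper/terminate slot).
def stub_model(instruction, image):
    return "128 128 128 128 128 128 128 0"

actions = control_loop(stub_model, lambda: None, "pick up the apple", steps=3)
```

In a real deployment the decoded vector would be denormalized to the robot's joint or end-effector ranges before being applied; the loop structure itself is the reusable part of the sketch.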

<\/div><\/div><\/div><\/div><\/div>\n\n\n\n

Potential Applications<\/strong><\/h2>\n\n\n\n