Droid future draws near with Google PaLM-E

Mar 10, 2023


Advanced deep learning models such as GPT-3 have paved the way for chatbot development, but physical robots have not been left behind. Recently, Google and Microsoft have delved into using similar AI models to enhance the capabilities of robots, resulting in impressive outcomes.

Researchers at Google and the Technical University of Berlin have introduced a new AI model called PaLM-E. It combines language and vision skills so that robots can operate independently in real-world situations, such as fetching a bag of chips from a kitchen or sorting colored blocks into designated corners of a rectangle.

PaLM-E is based on Google's earlier large language model, PaLM. The "E" in the name stands for "embodied," reflecting the model's ability to interact with physical objects and control robots. PaLM-E also builds on Google's RT-1 model, which takes robot inputs such as camera images and task instructions and outputs actions in the form of motor commands. In addition, it draws on ViT-22B, a vision transformer model trained on tasks such as image classification, object detection, and image captioning.

PaLM-E has drawn praise from experts

With 562 billion parameters, PaLM-E is the largest vision-language model (VLM) reported to date. Its abilities include mathematical reasoning, multi-image reasoning, and chain-of-thought reasoning. The researchers explain in their paper that training the model on many tasks at once lets skills transfer from one task to another, rather than each skill having to be learned by a model trained on a single task.

PaLM-E is an illustration of how the increased scale and advancement of large language models lead to improved capabilities, such as the ability to perform multimodal tasks with greater ease, accuracy, and autonomy.

These capabilities have drawn praise from academics, and they suggest that AI-driven physical action may be closer than we think.

According to Jeff Clune, an Associate Professor of Computer Science at the University of British Columbia, as reported by Motherboard:

“This work represents a major step forward, but on an expected path. It extends recent, exciting work out of DeepMind to the important and difficult arena of robotics (their work on ‘Frozen’ and ‘Flamingo’). More broadly, it is part of the recent tsunami of amazing AI advances that combine a simple, but powerful formula”.

Google is not alone in the VLM market

In addition to Google, Microsoft has also been exploring the application of multimodal AI and large language models in robotics. Microsoft's research involves extending the capabilities of ChatGPT to robotics and introducing a multimodal model named Kosmos-1, which can handle tasks such as image content analysis, visual puzzle-solving, visual recognition, and visual IQ tests.

According to the Microsoft researchers' report, integrating language models with robotic capabilities is a significant step toward artificial general intelligence (AGI), meaning AI with a level of intelligence comparable to that of human beings.

However, the researchers acknowledge that there are still real-world challenges to be addressed, such as navigating around obstacles in a kitchen or avoiding the risk of slipping.
