MLNews

Introducing DiVA-360 Camera: Robots Can Experience Real-world Scenes and Sounds!

Introducing a special camera that can capture high-quality audio and video from different angles at the same time. The DiVA-360 camera collects images and videos to teach robots how to work like humans. It can also photograph objects from all sides to give a full view. The project focuses on improving the way computers “see” and “hear” the world. To that end, the researchers built a special camera setup and used it to assemble a library of videos and pictures. Robots learn how to interact with people in the real world by watching the recorded videos.

Many researchers contributed to this work, including Cheng-You Lu, Peisen Zhou, Angela Xing, Chandradeep Pokhariya, Arnab Dey, Ishaan Shah, Rugved Mavidipalli, Dylan Hu, Andrew Comport, Kefan Chen, and Srinath Sridhar. The researchers are affiliated with several universities, including Brown University, and the paper was published on 1st August 2023.

Beginnings to Unstoppable Innovation

In the past, AI robots faced many limitations in understanding real-world situations. They could identify images and videos but could not handle simple tasks. For example, they could identify basic objects such as cats and dogs in pictures, but when they encountered a complex object, they struggled to identify the scene or the object in the image. It was also hard for these systems to observe dynamic activities, because the cameras used in the past were not advanced and captured images from only one angle.


Those older cameras could not capture high-quality images and videos, so the information provided to the system was not very helpful. They produced blurry images with a narrow view, which prevented the system from understanding the scene. As a result, performance suffered, and the system could not interact with the world the way humans do.

Introducing DiVA-360

The DiVA-360 research project has created an advanced AI robot by using special cameras called TRICS that work like super cameras. A TRICS camera can take high-quality videos and record sound from different angles. It acts as if it had many eyes and ears that see and hear everything. This powerful camera comes with a huge library of videos and images of real-world scenarios. It is used to capture scenes in motion, such as kids playing with toys or the motion of characters in a film. It covers the whole scene, including stationary objects such as tables and chairs.

It allows users to write descriptions for the recorded videos, which helps AI understand the real world through simple words. AI can now understand dynamic and complex scenes. As it becomes smarter over time, it recognizes human actions and dynamic environments.

Unleashing the Future

The future impact of the “DiVA-360 camera” research is promising, as it enhances the capabilities of robots so that they can navigate and interact with their surroundings intelligently. This could increase the demand for robots in many industries, such as manufacturing and healthcare, and in households. It could also be used in autonomous vehicles, which may reduce the chance of car accidents caused by human error, because these cars will recognize and respond to different objects while driving on the road.

By analyzing real-world scenes, it could also benefit the healthcare industry by assisting doctors in diagnosing disease more accurately and quickly. Moreover, it could have a huge impact on gaming and virtual worlds by creating realistic environments for people. It could change the way students learn by providing virtual environments and training programs for them.

Potential Applications

  • Virtual Reality and Augmented Reality
  • Gaming and Animation
  • Autonomous Vehicles
  • Robotics
  • Healthcare Simulation
  • Entertainment and Film Production
  • Human-Computer Interaction
  • Surveillance and Security
  • Environmental Monitoring
  • Education and Training

Availability

The research and its code are available on several sites: examples can be viewed at diva360.github.io, and the research paper is on arxiv.org and paperswithcode.com. The dataset is also listed on paperswithcode.com.

Technical Details

The DiVA-360 research project aims to enhance computer vision and AI capabilities by developing advanced software and hardware called TRICS. TRICS can capture high-resolution images and high-quality videos while covering all angles of real-world scenes and objects.

All of these images and videos are saved in a multimodal dataset that includes dynamic activities, interactions with objects, and static objects captured from all angles. Users can also add a description to each recording to make it easier to understand. The dataset is intended to help systems understand the scenes in complex images or videos. In addition, the system captures sound from the surrounding environment.
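To make the idea of a multimodal record more concrete, here is a minimal Python sketch of how one captured scene might be represented: several camera views, an optional audio track, and a text description, grouped by the three kinds of content mentioned above. The class name `CaptureRecord`, its fields, the file names, and the helper `group_by_category` are all hypothetical and chosen for illustration only; they are not the project's actual data format or API.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CaptureRecord:
    """Hypothetical record for one captured scene or object (illustrative only)."""
    name: str
    category: str  # assumed labels: "dynamic", "interaction", or "static"
    view_paths: List[str] = field(default_factory=list)  # one file per camera angle
    audio_path: Optional[str] = None  # None when no sound was recorded
    description: str = ""  # free-text description added by a user


def group_by_category(records: List[CaptureRecord]) -> Dict[str, List[str]]:
    """Return record names grouped by their category label."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for record in records:
        groups[record.category].append(record.name)
    return dict(groups)


# Two made-up records: a dynamic scene with sound and a static object without.
records = [
    CaptureRecord(
        name="toy_play",
        category="dynamic",
        view_paths=["cam00.mp4", "cam01.mp4"],
        audio_path="toy_play.wav",
        description="A child plays with a toy on a table.",
    ),
    CaptureRecord(name="chair", category="static", view_paths=["cam00.png"]),
]
```

Keeping the views, audio, and description together in one record reflects the point of the paragraph above: each recording carries more than one kind of information about the same scene.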

Conclusion

This project provides a comprehensive dataset of real-world scenes and objects. TRICS captures a high-quality multimodal dataset that can help systems recognize and understand the scenes in videos. The work has the potential to benefit major industries such as VR, gaming, healthcare, and autonomous vehicles.

You can also see blogs on new research on Smart AI Robots

