
MIT Researchers Develop AI System for Robots to Rapidly Map Large Environments Using Flexible Submap Stitching

A team of MIT researchers has developed a new AI-powered system that enables robots to rapidly create accurate 3D maps of large, complex environments using only images from onboard cameras. This breakthrough is especially valuable in time-sensitive scenarios like search-and-rescue missions, where a robot must quickly navigate a collapsed building and build a detailed map of the area while locating survivors.

Traditional methods for simultaneous localization and mapping (SLAM) often struggle in challenging environments or require precise camera calibration. While recent machine learning models have simplified implementation, they typically can only process a small number of images at once—limiting their use in real-world settings where robots must analyze thousands of images in real time.

To solve this, the MIT team created a system that breaks down large scenes into smaller, manageable submaps. Instead of trying to process the entire environment at once, the model generates individual submaps incrementally and then stitches them together into a complete 3D reconstruction. This approach allows the robot to build a full map of a large space—such as a crowded office corridor or the interior of the MIT Chapel—within seconds.

The key innovation lies in how the system handles alignment between submaps. Unlike conventional methods that rely solely on rotation and translation, the MIT model accounts for distortions that machine learning models can introduce—such as bent or stretched walls in a 3D reconstruction. By applying flexible mathematical transformations, the system ensures all submaps are deformed consistently, enabling accurate alignment even when the data is imperfect.

The method works out of the box, requiring no special camera calibration or expert tuning. It was tested using short video clips captured on a smartphone, producing 3D reconstructions with an average error of less than 5 centimeters.
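To make the alignment idea concrete, here is a minimal sketch of why a flexible transformation helps. A rigid alignment (rotation plus translation) cannot absorb stretch or shear in a distorted submap, but a general affine fit can. The function names (`estimate_affine`, `stitch_submaps`) and the use of NumPy least squares are illustrative assumptions, not the MIT system's actual implementation, which the article does not detail:

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares affine map (A, t) such that A @ s + t ~= d
    for paired landmark points src[i] -> dst[i]."""
    n = src.shape[0]
    X = np.hstack([src, np.ones((n, 1))])        # homogeneous coordinates, (n, 4)
    M, *_ = np.linalg.lstsq(X, dst, rcond=None)  # solves X @ M ~= dst; M is (4, 3)
    return M[:3].T, M[3]                         # A is (3, 3), t is (3,)

def stitch_submaps(submap_a, submap_b, shared_a, shared_b):
    """Deform submap_b into submap_a's frame using points
    (shared_a, shared_b) observed in both submaps, then merge."""
    A, t = estimate_affine(shared_b, shared_a)
    return np.vstack([submap_a, submap_b @ A.T + t])
```

Because the affine fit includes scale and shear terms, a wall that came out stretched in one submap can still be matched exactly onto its undistorted counterpart, which a rotation-and-translation-only fit could never do.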
The system also estimates the robot’s position in real time, which is essential for navigation.

The research, led by MIT graduate student Dominic Maggio, with contributions from postdoc Hyungtae Lim and senior author Luca Carlone, combines modern deep learning with classical computer vision principles. Carlone, an associate professor in the Department of Aeronautics and Astronautics and director of the MIT SPARK Laboratory, notes that understanding the underlying geometry of the problem was critical to achieving high performance and scalability.

The system outperforms existing methods in both speed and accuracy, making it a strong candidate for real-world deployment. Potential applications include enhancing wearable devices for extended reality, improving navigation for industrial robots in warehouses, and supporting emergency response teams in disaster zones.

The work is supported by the U.S. National Science Foundation, the U.S. Office of Naval Research, and the National Research Foundation of Korea. Carlone, who was on sabbatical as an Amazon Scholar, completed the research before joining Amazon. The team plans to further refine the method for use in more complex and dynamic environments, with future work focused on testing on real robots.