Spatial AI for Robotics - Friends of Imperial College
What is SLAM
SLAM stands for "Simultaneous Localisation and Mapping". A robot interacting with its environment has two main tasks it may need to perform. The first is "Localisation": given a map, the robot must identify where it is on that map. The second is "Mapping": given its location, the robot must build a map of the environment. S(simultaneous)LAM, as its name hints, is when a robot has to do both localisation and mapping at the same time. This is the really difficult case, because each task typically relies on having fixed data about the other - but when introduced to a completely new environment with no information about where it is, the robot must use SLAM.
The fundamental workings of SLAM are as follows: the robot has sensors (cameras or LiDAR) which constantly look out for landmarks. Each time a new landmark is identified, the robot creates a probabilistic estimate of where the object could be (is it a small object close up or a large object far away?) - this range of possible locations (a point cloud) is what goes into the probabilistic map. Then the robot moves, keeping track of its relative position, and from its new location continues to build the probabilistic map of what it can see. The key here is that it repeatedly observes all the landmarks it has ever identified, so the probabilistic location of each object can be refined by viewing it from multiple angles and positions. The uncertainty reduces, and through this continuous tracking of relative position and refinement of the probabilistic map, a map of the environment can soon be crafted!
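The "uncertainty reduces with repeated observation" idea can be sketched with a toy one-dimensional example. The code below is a minimal illustration (not from the talk, and far simpler than a real SLAM backend): it fuses repeated noisy observations of a single landmark's position using a Kalman-style Gaussian update, and the estimate's variance shrinks with every new view.

```python
def fuse(mean, var, z, z_var):
    """Fuse a Gaussian landmark estimate (mean, var) with a new
    noisy measurement z of variance z_var."""
    k = var / (var + z_var)           # gain: how much to trust the measurement
    new_mean = mean + k * (z - mean)  # pull the estimate toward the measurement
    new_var = (1 - k) * var           # uncertainty always decreases
    return new_mean, new_var

# Start with a very uncertain guess of where the landmark is...
mean, var = 0.0, 100.0

# ...then re-observe it from several robot positions (noisy measurements
# of the same landmark, here invented for illustration).
for z in [5.2, 4.8, 5.1, 4.9]:
    mean, var = fuse(mean, var, z, z_var=1.0)

print(round(mean, 2), round(var, 3))  # prints: 4.99 0.249
```

A full visual SLAM system does this jointly for the robot pose and every landmark (e.g. with an extended Kalman filter or factor-graph optimisation), but the core effect is the same: each re-observation tightens the probabilistic map.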
Different versions of SLAM
- MonoSLAM: sparse feature-based SLAM (2003), focused on egomotion (relative motion) tracking
- Dense Visual SLAM: Real-time dense reconstruction and direct camera tracking
- SLAM++: object-oriented SLAM; real-time object-level dense SLAM using CAD models of known objects
- Normal Prediction and Super Primitives: always-on single-frame normal prediction (segments an object and predicts surface normals, which can be cross-checked against other images)
- MAST3R-SLAM: uses off-the-shelf MAST3R as a strong two-view prior
Visual SLAM-Enabled Products
- Robotic vacuum cleaners (e.g. Dyson, iRobot)
- AR frameworks like ARKit
- Meta/Apple/Microsoft (HoloLens) headsets
- General-purpose robots
- Smart glasses
- Autonomous driving