Robust dense visual SLAM using sensor fusion and motion segmentation
Abstract
Visual simultaneous localisation and mapping (SLAM) is an important technique for
enabling mobile robots to navigate autonomously within their environments. Using
cameras, robots reconstruct a representation of their environment and simultaneously
localise themselves within it. A dense visual SLAM system produces a high-resolution
and detailed reconstruction of the environment which can be used for obstacle avoidance or semantic reasoning.
State-of-the-art dense visual SLAM systems demonstrate robust performance and
impressive accuracy in ideal conditions. However, these techniques rely on
assumptions that limit the extent to which they can be deployed in real applications.
Fundamentally, they require constant scene illumination, smooth camera motion and
the absence of moving objects from the scene. Relaxing these assumptions is not
trivial, and significant effort is needed to make dense visual SLAM approaches more
robust to real-world conditions.
The objective of this thesis is to develop dense visual SLAM systems that are
more robust to visually challenging real-world conditions. To this end, we leverage
sensor fusion and motion segmentation in situations where camera data is unsuitable.
The first contribution is a visual SLAM system for the NASA Valkyrie humanoid
robot that remains robust under the conditions encountered during the robot's
operation. It is based on a sensor fusion approach which combines visual SLAM with
leg odometry, demonstrating increased robustness to illumination changes and fast
camera motion.
Second, we investigate methods for robust visual odometry in the presence of moving
objects. We propose a formulation for joint visual odometry and motion segmentation
that demonstrates increased robustness in scenes with moving objects compared to
state-of-the-art approaches.
We then extend this method using inertial information from a gyroscope, and compare
the contributions of motion segmentation and motion prior integration to robustness
against scene dynamics. As part of this study, we provide a dataset recorded in
scenes with varying numbers of moving objects.
In conclusion, we find that both motion segmentation and motion prior integration
are necessary to achieve significantly better results in real-world conditions. While
motion priors increase robustness, motion segmentation improves the accuracy of the
reconstruction by filtering out moving objects.