dc.description.abstract | Traditionally, robots have been confined to settings where they operate in isolation and in highly
controlled and structured environments to execute well-defined, non-varying tasks. As a result,
they usually operate without the need to perceive their surroundings or to adapt to changing
stimuli. However, as robots start to move towards human-centred environments and share the
physical space with people, there is an urgent need to endow them with the flexibility to learn
and adapt given the changing nature of the stimuli they receive and the evolving requirements
of their users. Standard machine learning is not suitable for these types of applications because
it operates under the assumption that data samples are independent and identically distributed,
and requires access to all the data in advance. If any of these assumptions is broken, the model
fails catastrophically, i.e., either it does not learn or it forgets all that was previously learned.
Therefore, different strategies are required to address this problem.
The focus of this thesis is on lifelong object learning, whereby a model is able to learn
from data that becomes available over time. In particular, we address the problem of
class-incremental learning, with an emphasis on algorithms that can enable interactive learning with
a user. In class-incremental learning, models learn from sequential data batches where each
batch ideally contains samples from a single class. The emphasis on interactive
learning capabilities poses additional requirements in terms of the speed with which model
updates are performed as well as how the interaction is handled.
The work presented in this thesis follows two main lines. First,
we propose two versions of a lifelong learning algorithm composed of a feature extractor
based on pre-trained residual networks, an array of growing self-organising networks and a
classifier. Self-organising networks are able to adapt their structure based on the input data
distribution, and learn representative prototypes of the data. These prototypes can then be
used to train a classifier. The proposed approaches are evaluated on various benchmarks under
several conditions and the results show that they outperform competing approaches in each
case. Second, we propose a robot architecture to address lifelong object learning through
interactions with a human partner using natural language. The architecture consists of an
object segmentation, tracking and preprocessing pipeline, a dialogue system, and a learning
module based on the algorithm developed in the first part of the thesis. Finally, the thesis
explores the contribution that different preprocessing operations make to
performance when learning from both RGB and depth images. | en |