Designing coherent and engaging open-domain conversational AI systems
Abstract
Designing conversational AI systems able to engage in open-domain ‘social’ conversation is extremely challenging and remains a frontier of current research. Such systems
must maintain extensive awareness of the dialogue context, world knowledge,
and the user's intents and interests. This requires more sophisticated language understanding, dialogue management, and state and topic tracking mechanisms than
traditional task-oriented dialogue systems need. Given the wide coverage of topics in
open-domain dialogue, a conversation can span many turns in which a number of
complex linguistic phenomena (e.g. ellipsis and anaphora) occur and must
be resolved for the system to remain contextually aware. Such systems also need to be
engaging, keeping the users’ interest over long conversations. These are only some
of the challenges that open-domain dialogue systems face. Therefore, this thesis
focuses on designing dialogue systems able to hold extensive open-domain conversations in a coherent, engaging, and appropriate manner over multiple turns.
First, different types of dialogue system architecture and design decisions
are discussed for social open-domain conversations, along with relevant evaluation
metrics. A modular architecture for ensemble-based conversational systems, called
Alana, is then presented. Alana was a finalist in the Amazon Alexa Prize Challenge
in both 2017 and 2018 and is able to tackle many of the challenges of open-domain
social conversation.
The system combines features such as topic tracking, contextual natural
language understanding, entity linking, user modelling, information retrieval, and
response ranking, using a rich representation of dialogue state.
The thesis next analyses the performance of the 2017 system and describes the
upgrades developed for the 2018 system. This leads to an analysis and comparison
of the real-user data collected in both years with different system configurations,
allowing assessment of the impact of different design decisions and modules.
Finally, Alana was integrated into an embodied robotic platform and enhanced
with the ability to also perform tasks. This system was deployed and evaluated
in a shopping mall in Finland. Further analysis of the added embodiment is presented and discussed, as well as the challenges of translating open-domain dialogue
systems into other languages. Analysis of the collected real-user data shows
the importance of a variety of features developed and decisions made in the design
of the Alana system.