ROS Theses Repository

  • ROS Home
  • Mathematical & Computer Sciences
  • Doctoral Theses (Mathematical & Computer Sciences)

Relational knowledge and representation for reinforcement learning

View/Open
NgJunHaoA_0822_macsSS.pdf (11.18 MB)
Date
2022-08
Author
Ng Jun Hao, Alvin
Abstract
In reinforcement learning, an agent interacts with the environment, learns from feedback about the quality of its actions, and improves its behaviour or policy in order to maximise its expected utility. Learning efficiently in large-scale problems is a major challenge. State aggregation is possible in problems with first-order structure, allowing the agent to learn in an abstraction of the original problem that is of considerably smaller scale. One approach is to learn the Q-values of actions, approximated by a relational function approximator. This is the basis for relational reinforcement learning (RRL). We abstract the state with first-order features that consist only of variables, thereby aggregating similar states from all problems of the same domain into abstract states. We study the limitations of RRL that arise from this abstraction and introduce the concepts of consistent abstraction, subsumption of problems, and abstract-equivalent problems. We propose three methods to overcome these limitations, extending the types of problems our RRL method can solve. Next, to further improve learning efficiency, we propose to learn several types of generalised knowledge. The policy is influenced by directed exploration based on multiple types of intrinsic rewards, and avoids previously encountered dead ends. In addition, we incorporate model-based techniques to provide better-quality estimates of the Q-values. Transfer learning is possible by directly leveraging the generalised knowledge to accelerate learning in a new problem. Lastly, we introduce a new class of problems that considers dynamic objects and time-bounded goals. We discuss the complications these bring to RRL and present solutions. We also propose a framework for multi-agent coordination to achieve joint goals, represented as time-bounded goals, by decomposing a multi-agent problem into single-agent problems. We evaluate our work empirically in six domains to demonstrate its efficacy in solving large-scale problems and in transfer learning.
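The state-aggregation idea described in the abstract — learning one Q-value per abstract state so that similar concrete states share estimates — can be sketched as ordinary tabular Q-learning routed through an abstraction function. This is a minimal illustrative sketch under generic assumptions, not the thesis's relational method: `abstract`, `env_step`, and all parameter names here are hypothetical.

```python
import random
from collections import defaultdict

def q_learning_abstract(env_step, abstract, actions, start_state,
                        episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1):
    """Tabular Q-learning over abstract states (illustrative sketch).

    abstract(s)   -- maps a concrete state to an abstract state, so every
                     concrete state sharing an abstraction shares one Q-row.
    env_step(s,a) -- returns (next_state, reward, done); a stand-in for
                     the environment, not an API from the thesis.
    """
    Q = defaultdict(float)  # (abstract_state, action) -> value estimate
    for _ in range(episodes):
        s, done = start_state, False
        while not done:
            a_s = abstract(s)
            # epsilon-greedy action selection over the abstract Q-row
            if random.random() < epsilon:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda act: Q[(a_s, act)])
            s2, r, done = env_step(s, a)
            # bootstrap from the best action in the *abstract* next state
            best_next = 0.0 if done else max(Q[(abstract(s2), act)]
                                             for act in actions)
            Q[(a_s, a)] += alpha * (r + gamma * best_next - Q[(a_s, a)])
            s = s2
    return Q
```

With a coarser `abstract` (e.g. mapping ground states to sets of first-order features), the same update aggregates experience across all problems of a domain, which is the efficiency gain the abstract refers to.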
URI
http://hdl.handle.net/10399/4679
Collections
  • Doctoral Theses (Mathematical & Computer Sciences)

© Heriot-Watt University, Edinburgh, Scotland, UK EH14 4AS.

Maintained by the Library
Tel: +44 (0)131 451 3577
Library Email: libhelp@hw.ac.uk
ROS Email: open.access@hw.ac.uk

Scottish registered charity number: SC000278

  • About
  • Copyright
  • Accessibility
  • Policies
  • Privacy & Cookies
  • Feedback