Policy optimisation and generalisation for reinforcement learning agents in sparse reward navigation environments.

Jeewa, Asad.

dc.contributor.advisor	Pillay, Anban Woolaganathan.
dc.contributor.advisor	Jembere, Edgar.
dc.contributor.author	Jeewa, Asad.
dc.date.accessioned	2023-04-13T13:08:53Z
dc.date.available	2023-04-13T13:08:53Z
dc.date.created	2021
dc.date.issued	2021
dc.description	Masters Degree. University of KwaZulu-Natal, Durban.	en_US
dc.description.abstract	Sparse reward environments are prevalent in the real world, and training reinforcement learning agents in them remains a substantial challenge. Two particularly pertinent problems in these environments are policy optimisation and policy generalisation. This work focuses on the navigation task, in which agents learn to navigate past obstacles to distant targets and are rewarded on completion of the task. A novel compound reward function, Directed Curiosity, a weighted sum of curiosity-driven exploration and distance-based shaped rewards, is presented. The technique allowed for faster convergence and enabled agents to gain more rewards than agents trained with distance-based shaped rewards or curiosity alone. However, it resulted in policies that were highly optimised for the specific environment that the agents were trained on and that therefore did not generalise well to unseen environments. A training curriculum was designed to address this and resulted in the transfer of knowledge, when using the policy “as-is”, to unseen testing environments. It also eliminated the need for additional reward shaping and was shown to converge faster than curiosity-based agents. Combining curiosity with the curriculum provided no meaningful benefits and exhibited inferior policy generalisation.	en_US
dc.identifier.uri	https://researchspace.ukzn.ac.za/handle/10413/21412
dc.language.iso	en	en_US
dc.subject.other	Curriculum learning.	en_US
dc.subject.other	Machine learning--Reinforcement learning.	en_US
dc.subject.other	Incentive awards.	en_US
dc.subject.other	Employee motivation.	en_US
dc.title	Policy optimisation and generalisation for reinforcement learning agents in sparse reward navigation environments.	en_US
dc.type	Thesis	en_US
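
The abstract describes Directed Curiosity as a weighted sum of a curiosity-driven exploration reward and a distance-based shaped reward. The record does not give the exact formulation, so the following is only a minimal sketch of that idea. The function name, the two weights and their default values, and the choice of "progress toward the target" as the distance term are all illustrative assumptions, not the thesis's definitions.

```python
def directed_curiosity_reward(
    curiosity_reward: float,
    prev_distance: float,
    curr_distance: float,
    curiosity_weight: float = 0.5,  # assumed value, not from the thesis
    distance_weight: float = 0.5,   # assumed value, not from the thesis
) -> float:
    """Combine an intrinsic curiosity signal with a distance-based shaping term.

    Hypothetical sketch: the distance term here is the reduction in distance
    to the target over one step, which is one common shaping choice.
    """
    # Positive when the agent moved closer to the target, negative otherwise.
    distance_reward = prev_distance - curr_distance
    # Weighted sum of the two reward components.
    return curiosity_weight * curiosity_reward + distance_weight * distance_reward


# Example step: a small curiosity bonus while moving one unit closer to the target.
step_reward = directed_curiosity_reward(
    curiosity_reward=0.2, prev_distance=5.0, curr_distance=4.0
)
```

In a sketch like this, the two weights control how strongly the agent is pulled toward novel states versus toward the target; how the thesis sets or tunes them is not stated in this record.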

Files

Original bundle

Name: Asad_Jeewa_2021.pdf
Size: 21.97 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.64 KB
Format: Item-specific license agreed upon to submission

Collections

Masters Degrees (Computer Science)