Robot skill acquisition via representation sharing and reward conditioning

dc.contributor: Graduate Program in Computer Engineering.
dc.contributor.advisor: Uğur, Emre.
dc.contributor.author: Akbulut, Mete Tuluhan.
dc.date.accessioned: 2023-03-16T10:05:29Z
dc.date.available: 2023-03-16T10:05:29Z
dc.date.issued: 2021.
dc.description.abstract: Skill acquisition is a hallmark of intelligent behavior that Robot Learning aims to give to robots. An effective approach is to teach an initial version of the skill by demonstration, a form of Supervised Learning (SL) called Learning from Demonstration (LfD), and then let the robot improve it and adapt to novel tasks via Reinforcement Learning (RL). In this thesis, we first propose a novel LfD+RL framework, Adaptive Conditional Neural Movement Primitives (ACNMP), that utilizes LfD and RL simultaneously during adaptation by making demonstrations and RL-guided trajectories share the same latent representation space. We show through simulation experiments that (i) ACNMP successfully adapts the skill using an order of magnitude fewer trajectory samples than baselines; (ii) its simultaneous training method preserves the demonstration characteristics; and (iii) ACNMP enables skill transfer between robots with different morphologies. Our real-world experiments verify the suitability of ACNMP for real-world applications, where non-linearity and the number of dimensions increase. Next, we extend the idea of using SL in reward-based skill learning and propose our second framework, Reward Conditioned Neural Movement Primitives (RC-NMP), where learning is done using only SL. RC-NMP takes rewards as input and generates trajectories conditioned on desired rewards. The model uses variational inference to create a stochastic latent representation space from which varying trajectories are sampled to form a trajectory population. Finally, the diversity of the population is increased using crossover and mutation operations from Evolutionary Strategies to handle environments with sparse rewards, multiple solutions, or local minima. Our simulation and real-world experiments show that RC-NMP is more stable and efficient than ACNMP and two other robotic RL algorithms.
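The RC-NMP pipeline described in the abstract can be sketched in miniature: a generator conditioned on a desired reward defines a stochastic latent distribution, a population of trajectories is sampled from it, and crossover/mutation operations diversify that population. The sketch below uses toy linear maps and illustrative dimensions; all names (`W_enc`, `W_dec`, `sample_population`, `diversify`) and sizes are assumptions for illustration, not the thesis's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical reward-conditioned generator: the desired reward is mapped
# to the mean of a latent Gaussian, and latents are decoded linearly into
# trajectories. Dimensions are illustrative, not from the thesis.
LATENT_DIM, TRAJ_LEN = 4, 20
W_enc = rng.normal(size=(1, LATENT_DIM))         # reward -> latent mean
W_dec = rng.normal(size=(LATENT_DIM, TRAJ_LEN))  # latent -> trajectory

def sample_population(desired_reward, n=8, sigma=0.1):
    """Sample varied trajectories from a stochastic latent space
    conditioned on the desired reward (variational-style sampling)."""
    mu = np.array([[desired_reward]]) @ W_enc           # (1, LATENT_DIM)
    z = mu + sigma * rng.normal(size=(n, LATENT_DIM))   # reparameterized draws
    return z @ W_dec                                    # (n, TRAJ_LEN)

def diversify(pop, mut_sigma=0.05):
    """Increase population diversity with one-point crossover and
    Gaussian mutation, in the spirit of Evolutionary Strategies."""
    n, cut = len(pop), TRAJ_LEN // 2
    parents = rng.permutation(n)
    children = np.concatenate(
        [pop[:, :cut], pop[parents, cut:]], axis=1)     # one-point crossover
    children += mut_sigma * rng.normal(size=children.shape)  # mutation
    return np.concatenate([pop, children], axis=0)

pop = sample_population(desired_reward=1.0)
pop = diversify(pop)
print(pop.shape)  # (16, 20): original 8 trajectories plus 8 offspring
```

In the actual framework the encoder/decoder are neural networks trained with variational inference; here the linear maps only illustrate the data flow from desired reward to trajectory population.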
dc.format.extent: 30 cm.
dc.format.pages: xiv, 54 leaves ;
dc.identifier.other: CMPE 2021 A43
dc.identifier.uri: https://digitalarchive.library.bogazici.edu.tr/handle/123456789/12457
dc.publisher: Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2021.
dc.subject.lcsh: Machine learning.
dc.subject.lcsh: Ability.
dc.subject.lcsh: Robotics -- Programming.
dc.title: Robot skill acquisition via representation sharing and reward conditioning

Files

Original bundle (2 of 2)
- b2765693.036875.001.PDF (6.05 MB, Adobe Portable Document Format)
- b2765693.036876.001.zip (9.18 MB, unknown data format)