Robot skill acquisition via representation sharing and reward conditioning

dc.contributor: Graduate Program in Computer Engineering.
dc.contributor.advisor: Uğur, Emre.
dc.contributor.author: Akbulut, Mete Tuluhan.
dc.date.accessioned: 2023-03-16T10:05:29Z
dc.date.available: 2023-03-16T10:05:29Z
dc.date.issued: 2021.
dc.description.abstract: Skill acquisition is a hallmark of intelligent behavior that Robot Learning aims to give to robots. An effective approach is to teach an initial version of the skill by demonstration, a form of Supervised Learning (SL) called Learning from Demonstration (LfD), and then let the robot improve it and adapt to novel tasks via Reinforcement Learning (RL). In this thesis, we first propose a novel LfD+RL framework, Adaptive Conditional Neural Movement Primitives (ACNMP), that utilizes LfD and RL simultaneously during adaptation by making demonstrations and RL-guided trajectories share the same latent representation space. We show through simulation experiments that (i) ACNMP successfully adapts the skill using an order of magnitude fewer trajectory samples than baselines; (ii) its simultaneous training method preserves the demonstration characteristics; and (iii) ACNMP enables skill transfer between robots with different morphologies. Our real-world experiments verify the suitability of ACNMP for real-world applications, where non-linearity and the number of dimensions increase. Next, we extend the idea of using SL in reward-based skill learning and propose our second framework, Reward Conditioned Neural Movement Primitives (RC-NMP), where learning is done using only SL. RC-NMP takes rewards as input and generates trajectories conditioned on desired rewards. The model uses variational inference to create a stochastic latent representation space from which varying trajectories are sampled to form a trajectory population. Finally, the diversity of the population is increased using crossover and mutation operations from Evolutionary Strategies to handle environments with sparse rewards, multiple solutions, or local minima. Our simulation and real-world experiments show that RC-NMP is more stable and efficient than ACNMP and two other robotic RL algorithms.
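The RC-NMP pipeline described in the abstract can be sketched in miniature: a generator conditioned on a desired reward defines a stochastic latent distribution, a population of trajectories is sampled from it, and crossover/mutation operations diversify that population. The sketch below uses toy linear maps and illustrative dimensions; all names (`W_enc`, `W_dec`, `sample_population`, `diversify`) and sizes are assumptions for illustration, not the thesis's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical reward-conditioned generator: the desired reward is mapped
# to the mean of a latent Gaussian, and latents are decoded linearly into
# trajectories. Dimensions are illustrative, not from the thesis.
LATENT_DIM, TRAJ_LEN = 4, 20
W_enc = rng.normal(size=(1, LATENT_DIM))         # reward -> latent mean
W_dec = rng.normal(size=(LATENT_DIM, TRAJ_LEN))  # latent -> trajectory

def sample_population(desired_reward, n=8, sigma=0.1):
    """Sample varied trajectories from a stochastic latent space
    conditioned on the desired reward (variational-style sampling)."""
    mu = np.array([[desired_reward]]) @ W_enc           # (1, LATENT_DIM)
    z = mu + sigma * rng.normal(size=(n, LATENT_DIM))   # reparameterized draws
    return z @ W_dec                                    # (n, TRAJ_LEN)

def diversify(pop, mut_sigma=0.05):
    """Increase population diversity with one-point crossover and
    Gaussian mutation, in the spirit of Evolutionary Strategies."""
    n, cut = len(pop), TRAJ_LEN // 2
    parents = rng.permutation(n)
    children = np.concatenate(
        [pop[:, :cut], pop[parents, cut:]], axis=1)     # one-point crossover
    children += mut_sigma * rng.normal(size=children.shape)  # mutation
    return np.concatenate([pop, children], axis=0)

pop = sample_population(desired_reward=1.0)
pop = diversify(pop)
print(pop.shape)  # (16, 20): original 8 trajectories plus 8 offspring
```

In the actual framework the encoder/decoder are neural networks trained with variational inference; here the linear maps only illustrate the data flow from desired reward to trajectory population.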
dc.format.extent: 30 cm.
dc.format.pages: xiv, 54 leaves ;
dc.identifier.other: CMPE 2021 A43
dc.identifier.uri: https://digitalarchive.library.bogazici.edu.tr/handle/123456789/12457
dc.publisher: Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2021.
dc.subject.lcsh: Machine learning.
dc.subject.lcsh: Ability.
dc.subject.lcsh: Robotics -- Programming.
dc.title: Robot skill acquisition via representation sharing and reward conditioning

Files

Original bundle (2 of 2)
- b2765693.036875.001.PDF (6.05 MB, Adobe Portable Document Format)
- b2765693.036876.001.zip (9.18 MB, unknown data format)