Robotic systems that are developed for social and dynamic environments require adaptive mechanisms to successfully operate. Consequently, learning from rewards has provided meaningful results in applications involving human-robot interaction. In those cases where the robot's state space number of actions is extensive, dimensionality becomes intractable this drastically slows down process. This ...