This paper introduces a novel method of adding intrinsic bonuses to a task-oriented reward function in order to efficiently facilitate reinforcement learning search. While various bonuses have been designed to date, they are analogous to the depth-first and breadth-first search algorithms in graph theory. This paper, therefore, first designs two bonuses for each of them. Then, heuristic gain scheduling is applied to these bonuses, inspired ...