Bellman

AI Paper Review/Deep RL Papers [EN](20)

Munchausen Reinforcement Learning
https://arxiv.org/abs/2007.14430 0. TD error and bootstrapping in reinforcement learning Munchausen Reinforcement Learning (M-RL) is actually a really simple idea. Bootstrapping is a core idea in reinforcement learning, especially when learning Q-functions with a temporal-difference error. For example, we don't know the optimal Q-function at t+1, but the agent can use its current estimate of it as a learning target. We re..
2021.05.31
Self-Imitation Advantage Learning (SAIL)
https://arxiv.org/abs/2012.11989 Self-Imitation Advantage Learning: Self-imitation learning is a Reinforcement Learning (RL) method that encourages actions whose returns were higher than expected, which helps in hard exploration and sparse reward problems. It was shown to improve the performance of on-policy actor-critic m.. 1. Self-imitation reinforcement learning Self-imitation learning..
2021.05.31
