Python Minimax Algorithm

Information-Theoretic Minimax Regret Bounds for Reinforcement Learning based on Duality

Abstract: We study agents acting in an unknown environment where the agent’s goal is to find a robust policy. We consider robust policies as policies that achieve high cumulative rewards for all ...
