Is Deep Reinforcement Learning Really Superhuman on Atari?

Abstract: Consistent and reproducible evaluation of Deep Reinforcement Learning (DRL) is not straightforward. In the Arcade Learning Environment (ALE), small changes in environment parameters such as stochasticity or the maximum allowed play time can lead to very different performance. In this work, we discuss the difficulties of comparing different agents trained on ALE. To take a step further towards reproducible and comparable DRL, we introduce SABER, a Standardized Atari BEnchmark for general Reinforcement learning algorithms. Our methodology extends previous recommendations and specifies a complete set of environment parameters as well as training and testing procedures. We then use SABER to evaluate the current state of the art, Rainbow. Furthermore, we introduce a human world records baseline, and argue that previous claims of expert or superhuman performance of DRL may not be accurate. Finally, we propose Rainbow-IQN, which extends Rainbow with Implicit Quantile Networks (IQN), achieving new state-of-the-art performance. Source code is available for reproducibility.
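To illustrate the abstract's "human world records baseline" idea, here is a minimal sketch of how a score could be normalized against a human world record instead of an average human score. This is a hypothetical illustration, not the paper's exact methodology: the normalization formula is assumed to follow the standard ALE human-normalized score (zeroed at the random-policy score), and all numbers below are made-up placeholders.

```python
# Hypothetical sketch: normalize an agent's game score against the human
# world record rather than an average human score, with the random-policy
# score as the zero point. The exact formula used by SABER should be
# checked against the paper; this is an assumption.

def record_normalized_score(agent_score: float,
                            random_score: float,
                            world_record: float) -> float:
    """Return the agent's score as a percentage of the human world record."""
    return 100.0 * (agent_score - random_score) / (world_record - random_score)

# Illustrative placeholder numbers: an agent well above an "average human"
# can still sit far below the human world record.
pct = record_normalized_score(agent_score=5000.0,
                              random_score=200.0,
                              world_record=100000.0)
print(f"{pct:.1f}% of the human world record")
```

Under this normalization, 100% means matching the world record, which makes "superhuman" claims much harder to reach than with an average-human baseline.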

Cited literature: 34 references

https://hal-mines-paristech.archives-ouvertes.fr/hal-02368263
Contributor: Fabien Moutarde <>
Submitted on: Tuesday, November 19, 2019 - 17:57:33
Last modified on: Friday, November 22, 2019 - 13:39:59

File

IsDRL-reallySuperHuman-onAtari...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02368263, version 1

Citation

Marin Toromanoff, Emilie Wirbel, Fabien Moutarde. Is Deep Reinforcement Learning Really Superhuman on Atari?. Deep Reinforcement Learning Workshop of 39th Conference on Neural Information Processing Systems (Neurips'2019), Dec 2019, Vancouver, Canada. ⟨hal-02368263⟩

