Atari Games on Atari 2600 Tutankham
Metrics
Score
Results
Performance results of various models on this benchmark
Comparison table
Model name | Score |
---|---|
deep-reinforcement-learning-with-double-q | 45.6 |
Model 2 | 98.2 |
dueling-network-architectures-for-deep | 218.4 |
prioritized-experience-replay | 204.6 |
generalized-data-distribution-iteration | 423.9 |
recurrent-experience-replay-in-distributed | 395.3 |
mastering-atari-go-chess-and-shogi-by | 491.48 |
dueling-network-architectures-for-deep | 245.9 |
noisy-networks-for-exploration | 269 |
evolving-simple-programs-for-playing-atari | 0 |
learning-values-across-many-orders-of | 183.9 |
gdi-rethinking-what-makes-reinforcement | 423.9 |
deep-reinforcement-learning-with-double-q | 92.2 |
asynchronous-methods-for-deep-reinforcement | 144.2 |
generalized-data-distribution-iteration | 418.2 |
a-distributional-perspective-on-reinforcement | 280.0 |
increasing-the-action-gap-new-operators-for | 245.22 |
deep-reinforcement-learning-with-double-q | 68.1 |
the-arcade-learning-environment-an-evaluation | 114.3 |
train-a-real-world-local-path-planner-in-one | 252.9 |
agent57-outperforming-the-atari-human | 2354.91 |
dueling-network-architectures-for-deep | 48.0 |
dna-proximal-policy-optimization-with-a-dual | 127 |
deep-reinforcement-learning-with-double-q | 108.6 |
impala-scalable-distributed-deep-rl-with | 292.11 |
massively-parallel-methods-for-deep | 118.5 |
self-imitation-learning | 340.5 |
distributed-prioritized-experience-replay | 272.6 |
evolution-strategies-as-a-scalable | 130.3 |
prioritized-experience-replay | 56.9 |
deep-attention-recurrent-q-network | 197 |
distributional-reinforcement-learning-with-1 | 297 |
the-arcade-learning-environment-an-evaluation | 225.5 |
dueling-network-architectures-for-deep | 211.4 |
recurrent-rational-networks | 184 |
human-level-control-through-deep | 186.7 |
deep-exploration-via-bootstrapped-dqn | 214.8 |
asynchronous-methods-for-deep-reinforcement | 156.3 |
mastering-atari-with-discrete-world-models-1 | 264 |
implicit-quantile-networks-for-distributional | 293 |
asynchronous-methods-for-deep-reinforcement | 26.1 |
policy-optimization-with-penalized-point | 241.21 |
recurrent-rational-networks | 179 |
online-and-offline-reinforcement-learning-by | 347.99 |