Atari Games on Atari 2600 Beam Rider
Metrics
Score
Results
Performance results of various models on this benchmark
Comparison table
| Model name | Score |
| --- | --- |
| asynchronous-methods-for-deep-reinforcement | 24622.2 |
| prioritized-experience-replay | 23384.2 |
| impala-scalable-distributed-deep-rl-with | 32463.47 |
| deep-reinforcement-learning-with-double-q | 9743.2 |
| learning-values-across-many-orders-of | 8299.4 |
| dueling-network-architectures-for-deep | 12164.0 |
| human-level-control-through-deep | 6846.0 |
| policy-optimization-with-penalized-point | 4549 |
| the-arcade-learning-environment-an-evaluation | 929.4 |
| playing-atari-with-deep-reinforcement | 5184 |
| Model 11 | 1743.0 |
| mastering-atari-with-discrete-world-models-1 | 18646 |
| massively-parallel-methods-for-deep | 3822.1 |
| recurrent-independent-mechanisms | 5320 |
| dueling-network-architectures-for-deep | 30276.5 |
| asynchronous-methods-for-deep-reinforcement | 13235.9 |
| evolving-simple-programs-for-playing-atari | 1341.6 |
| the-arcade-learning-environment-an-evaluation | 6624.6 |
| iq-learn-inverse-soft-q-learning-for | - |
| online-and-offline-reinforcement-learning-by | 333077.44 |
| agent57-outperforming-the-atari-human | 300509.8 |
| soft-actor-critic-for-discrete-action | 432.1 |
| deep-reinforcement-learning-with-double-q | 37412.2 |
| a-distributional-perspective-on-reinforcement | 14074.0 |
| noisy-networks-for-exploration | 23134 |
| asynchronous-methods-for-deep-reinforcement | 22707.9 |
| recurrent-experience-replay-in-distributed | 188257.4 |
| deep-reinforcement-learning-with-double-q | 17417.2 |
| dueling-network-architectures-for-deep | 13772.8 |
| distributional-reinforcement-learning-with-1 | 34821 |
| mastering-atari-go-chess-and-shogi-by | 454993.53 |
| deep-reinforcement-learning-with-double-q | 8627.5 |
| evolution-strategies-as-a-scalable | 744.0 |
| dueling-network-architectures-for-deep | 14591.3 |
| distributed-deep-reinforcement-learning-learn | 14900 |
| generalized-data-distribution-iteration | 162100 |
| mean-actor-critic | 6072 |
| deep-exploration-via-bootstrapped-dqn | 23429.8 |
| increasing-the-action-gap-new-operators-for | 13145.34 |
| implicit-quantile-networks-for-distributional | 42776 |
| generalized-data-distribution-iteration | 422890 |
| train-a-real-world-local-path-planner-in-one | 26841.6 |
| the-reactor-a-fast-and-sample-efficient-actor | 11033.4 |
| increasing-the-action-gap-new-operators-for | 10054.58 |
| dna-proximal-policy-optimization-with-a-dual | 20393 |
| gdi-rethinking-what-makes-reinforcement | 162100 |
| self-imitation-learning | 2366.2 |
| distributed-prioritized-experience-replay | 63305.2 |
| prioritized-experience-replay | 31181.3 |
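
The Score column is the raw (unnormalized) episode return in the Arcade Learning Environment: the sum of per-step game rewards over one Beam Rider episode. A minimal sketch of how such a score can be measured with Gymnasium's ALE bindings follows; it assumes the `gymnasium` and `ale-py` packages are installed, and the random policy is only a placeholder, not any of the methods listed above.

```python
import gymnasium as gym
import ale_py

# Register the ALE/* environments (needed with recent ale-py releases).
gym.register_envs(ale_py)

def run_episode(env: gym.Env, seed: int = 0) -> float:
    """Play one episode and return the raw game score (sum of rewards)."""
    obs, info = env.reset(seed=seed)
    total_reward = 0.0
    done = False
    while not done:
        action = env.action_space.sample()  # placeholder: random policy
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        done = terminated or truncated
    return total_reward

if __name__ == "__main__":
    env = gym.make("ALE/BeamRider-v5")
    scores = [run_episode(env, seed=s) for s in range(5)]
    print(f"mean episode score over 5 runs: {sum(scores) / len(scores):.1f}")
    env.close()
```

Note that the papers behind these entries differ in evaluation protocol (number of evaluation episodes, no-op or human starts, sticky actions, frame limits), so scores in different rows are not strictly comparable.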