Atari Games on Atari 2600 Freeway
Metrics
Score
Results
Performance results of the different models on this benchmark
Comparison Table
Model name | Score |
---|---|
asynchronous-methods-for-deep-reinforcement | 0.1 |
deep-exploration-via-bootstrapped-dqn | 33.9 |
soft-actor-critic-for-discrete-action | 4.4 |
dueling-network-architectures-for-deep | 0.0 |
dueling-network-architectures-for-deep | 0.2 |
asynchronous-methods-for-deep-reinforcement | 0.1 |
large-scale-study-of-curiosity-driven | 32.8 |
policy-optimization-with-penalized-point | 21.21 |
massively-parallel-methods-for-deep | 10.2 |
discrete-latent-space-world-models-for | 29 |
first-return-then-explore | 34 |
prioritized-experience-replay | 28.9 |
count-based-exploration-with-neural-density | 31.7 |
distributed-prioritized-experience-replay | 33.7 |
learning-values-across-many-orders-of | 33.4 |
count-based-exploration-in-feature-space-for | 0.0 |
evolution-strategies-as-a-scalable | 31.0 |
increasing-the-action-gap-new-operators-for | 31.72 |
recurrent-experience-replay-in-distributed | 32.5 |
the-arcade-learning-environment-an-evaluation | 22.5 |
Model 21 | 19.7 |
unifying-count-based-exploration-and | 30.48 |
optimizing-the-neural-architecture-of | 22 |
evolving-simple-programs-for-playing-atari | 28.2 |
incentivizing-exploration-in-reinforcement | 27.0 |
mastering-atari-with-discrete-world-models-1 | 33 |
agent57-outperforming-the-atari-human | 32.59 |
dueling-network-architectures-for-deep | 33.0 |
deep-reinforcement-learning-with-double-q | 28.8 |
prioritized-experience-replay | 33.7 |
a-distributional-perspective-on-reinforcement | 33.9 |
distributional-reinforcement-learning-with-1 | 34 |
generalized-data-distribution-iteration | 34 |
mastering-atari-go-chess-and-shogi-by | 33.03 |
generalized-data-distribution-iteration | 34 |
curl-contrastive-unsupervised-representations | 27.9 |
count-based-exploration-with-the-successor | 29.5 |
the-arcade-learning-environment-an-evaluation | 0.4 |
human-level-control-through-deep | 30.3 |
count-based-exploration-with-neural-density | 33.0 |
deep-reinforcement-learning-with-double-q | 28.2 |
increasing-the-action-gap-new-operators-for | 32.3 |
self-imitation-learning | 32.2 |
asynchronous-methods-for-deep-reinforcement | 0.1 |
generalized-data-distribution-iteration | 34 |
count-based-exploration-in-feature-space-for | 29.9 |
dna-proximal-policy-optimization-with-a-dual | 33 |
online-and-offline-reinforcement-learning-by | 33.87 |
implicit-quantile-networks-for-distributional | 34 |
deep-reinforcement-learning-with-double-q | 26.9 |
dueling-network-architectures-for-deep | 33.3 |
deep-reinforcement-learning-with-double-q | 30.8 |
impala-scalable-distributed-deep-rl-with | 0.00 |
gdi-rethinking-what-makes-reinforcement | 34 |
optimizing-the-neural-architecture-of | 22 |
exploration-a-study-of-count-based | 34.0 |
train-a-real-world-local-path-planner-in-one | 33.9 |
noisy-networks-for-exploration | 34 |
the-arcade-learning-environment-an-evaluation | 19.1 |
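The Score column above is the raw, unnormalized episode return reported for the Freeway environment. The sketch below shows one way such a raw episode score could be measured using the Gymnasium ALE interface; the environment ID, the random placeholder policy, and the single-episode loop are illustrative assumptions, not the evaluation protocol used by the papers listed in the table.

```python
# Illustrative sketch: measuring one raw episode score on ALE Freeway.
# Assumes `gymnasium` and `ale-py` are installed; the random action choice
# is a stand-in for a trained agent, not any listed paper's method.
import gymnasium as gym
import ale_py  # provides the ALE/* Atari environments

gym.register_envs(ale_py)  # make the ALE/* IDs visible (recent gymnasium/ale-py versions)

env = gym.make("ALE/Freeway-v5")
obs, info = env.reset(seed=0)

episode_score = 0.0
done = False
while not done:
    action = env.action_space.sample()  # placeholder policy
    obs, reward, terminated, truncated, info = env.step(action)
    episode_score += reward
    done = terminated or truncated

env.close()
print(f"Freeway episode score: {episode_score}")
```

Published results are typically averaged over many evaluation episodes (and often over multiple seeds), so a single episode like the one above would only be one sample of the score reported in the table.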