Atari Games On Atari 2600 Up And Down
Evaluation Metrics
Score
Evaluation Results
Performance of each model on this benchmark
Comparison Table
Model Name | Score |
---|---|
distributional-reinforcement-learning-with-1 | 71260 |
the-arcade-learning-environment-an-evaluation | 3532.7 |
evolution-strategies-as-a-scalable | 67974.0 |
soft-actor-critic-for-discrete-action | 250.7 |
deep-reinforcement-learning-with-double-q | 9989.9 |
implicit-quantile-networks-for-distributional | 88148 |
dueling-network-architectures-for-deep | 22972.2 |
asynchronous-methods-for-deep-reinforcement | 74705.7 |
mastering-atari-go-chess-and-shogi-by | 715545.61 |
policy-optimization-with-penalized-point | 242701.51 |
impala-scalable-distributed-deep-rl-with | 332546.75 |
human-level-control-through-deep | 8456.0 |
asynchronous-methods-for-deep-reinforcement | 54525.4 |
asynchronous-methods-for-deep-reinforcement | 105728.7 |
self-imitation-learning | 53314.6 |
mastering-atari-with-discrete-world-models-1 | 653662 |
increasing-the-action-gap-new-operators-for | 13909.74 |
evolving-simple-programs-for-playing-atari | 14524 |
dueling-network-architectures-for-deep | 44939.6 |
recurrent-experience-replay-in-distributed | 589226.9 |
dueling-network-architectures-for-deep | 33879.1 |
train-a-real-world-local-path-planner-in-one | 25127.4 |
deep-exploration-via-bootstrapped-dqn | 26231 |
dna-proximal-policy-optimization-with-a-dual | 291934 |
deep-reinforcement-learning-with-double-q | 8038.5 |
a-distributional-perspective-on-reinforcement | 15612.0 |
prioritized-experience-replay | 16154.1 |
deep-reinforcement-learning-with-double-q | 22681.3 |
agent57-outperforming-the-atari-human | 623805.73 |
noisy-networks-for-exploration | 61326 |
gdi-rethinking-what-makes-reinforcement | 986440 |
distributed-prioritized-experience-replay | 401884.3 |
learning-values-across-many-orders-of | 22474.4 |
dueling-network-architectures-for-deep | 24759.2 |
the-arcade-learning-environment-an-evaluation | 74473.6 |
prioritized-experience-replay | 12157.4 |
massively-parallel-methods-for-deep | 8747.7 |
Model 38 | 2449.0 |
generalized-data-distribution-iteration | 986440 |
generalized-data-distribution-iteration | 966590 |
curl-contrastive-unsupervised-representations | 2735.2 |
deep-reinforcement-learning-with-double-q | 19086.9 |
recurrent-independent-mechanisms | 390000 |
online-and-offline-reinforcement-learning-by | 634898.18 |