Atari Games On Atari 2600 Pong
Evaluation Metric
Score
Evaluation Results
Performance of each model on this benchmark.
Comparison Table
Model Name | Score |
---|---|
prioritized-experience-replay | 18.9 |
evolution-strategies-as-a-scalable | 21.0 |
evolving-simple-programs-for-playing-atari | 20 |
noisy-networks-for-exploration | 21 |
discrete-latent-space-world-models-for | 20.2 |
recurrent-rational-networks | 18.13 |
mean-actor-critic | 10.6 |
dueling-network-architectures-for-deep | 20.9 |
mastering-atari-with-discrete-world-models-1 | 20 |
a-distributional-perspective-on-reinforcement | 20.9 |
dna-proximal-policy-optimization-with-a-dual | 19.7 |
playing-atari-with-deep-reinforcement | 21 |
human-level-control-through-deep | 18.9 |
massively-parallel-methods-for-deep | 16.7 |
online-and-offline-reinforcement-learning-by | 20.95 |
curl-contrastive-unsupervised-representations | 2.1 |
deep-reinforcement-learning-with-double-q | 19.5 |
increasing-the-action-gap-new-operators-for | 19.66 |
distributed-prioritized-experience-replay | 20.9 |
generalized-data-distribution-iteration | 21 |
asynchronous-methods-for-deep-reinforcement | 11.4 |
asynchronous-methods-for-deep-reinforcement | 10.7 |
train-a-real-world-local-path-planner-in-one | 21 |
Model 24 | -17.4 |
impala-scalable-distributed-deep-rl-with | 20.98 |
increasing-the-action-gap-new-operators-for | 19.76 |
asynchronous-methods-for-deep-reinforcement | 5.6 |
generalized-data-distribution-iteration | 21.0 |
recurrent-rational-networks | 18.04 |
the-arcade-learning-environment-an-evaluation | -19 |
implicit-quantile-networks-for-distributional | 21 |
deep-reinforcement-learning-with-double-q | 18.0 |
distributed-deep-reinforcement-learning-learn | 20 |
agent57-outperforming-the-atari-human | 20.67 |
recurrent-experience-replay-in-distributed | 21.0 |
generalized-data-distribution-iteration | 21 |
dueling-network-architectures-for-deep | 21.0 |
self-imitation-learning | 20.9 |
dueling-network-architectures-for-deep | 18.8 |
dueling-network-architectures-for-deep | 20.9 |
deep-exploration-via-bootstrapped-dqn | 20.9 |
soft-actor-critic-for-discrete-action | -20.98 |
prioritized-experience-replay | 20.6 |
deep-reinforcement-learning-with-double-q | 18.4 |
generalized-data-distribution-iteration | 21.0 |
deep-reinforcement-learning-with-double-q | 19.1 |
policy-optimization-with-penalized-point | 20.5 |
distributional-reinforcement-learning-with-1 | 21 |
the-arcade-learning-environment-an-evaluation | 21 |
mastering-atari-go-chess-and-shogi-by | 21.00 |
learning-values-across-many-orders-of | 20.6 |
decision-transformer-reinforcement-learning | 17.1 |