Atari Games on Atari 2600 Bank Heist
Evaluation Metric
Score
Evaluation Results
Performance of each model on this benchmark
Comparison Table
Model Name | Score |
---|---|
deep-reinforcement-learning-with-double-q | 1004.6 |
improving-computational-efficiency-in-visual | 276.6 |
noisy-networks-for-exploration | 1318 |
online-and-offline-reinforcement-learning-by | 27219.8 |
the-arcade-learning-environment-an-evaluation | 190.8 |
dna-proximal-policy-optimization-with-a-dual | 1286 |
dueling-network-architectures-for-deep | 1030.6 |
massively-parallel-methods-for-deep | 399.4 |
deep-reinforcement-learning-with-double-q | 312.7 |
dueling-network-architectures-for-deep | 1129.3 |
policy-optimization-with-penalized-point | 1212.23 |
mastering-atari-with-discrete-world-models-1 | 1126 |
learning-values-across-many-orders-of | 1103.3 |
Model 14 | 67.4 |
deep-reinforcement-learning-with-double-q | 886.0 |
deep-exploration-via-bootstrapped-dqn | 1208 |
increasing-the-action-gap-new-operators-for | 633.63 |
curl-contrastive-unsupervised-representations | 193.7 |
discrete-latent-space-world-models-for | 121.6 |
implicit-quantile-networks-for-distributional | 1416 |
the-reactor-a-fast-and-sample-efficient-actor | 1259.7 |
agent57-outperforming-the-atari-human | 23071.5 |
a-distributional-perspective-on-reinforcement | 976.0 |
evolution-strategies-as-a-scalable | 225.0 |
distributed-prioritized-experience-replay | 1716.4 |
asynchronous-methods-for-deep-reinforcement | 932.8 |
train-a-real-world-local-path-planner-in-one | 1340.9 |
evolving-simple-programs-for-playing-atari | 148 |
the-arcade-learning-environment-an-evaluation | 497.8 |
human-level-control-through-deep | 429.7 |
prioritized-experience-replay | 876.6 |
impala-scalable-distributed-deep-rl-with | 1223.15 |
deep-reinforcement-learning-with-double-q | 455.0 |
mastering-atari-go-chess-and-shogi-by | 1278.98 |
generalized-data-distribution-iteration | 1380 |
dueling-network-architectures-for-deep | 1503.1 |
dueling-network-architectures-for-deep | 1611.9 |
distributional-reinforcement-learning-with-1 | 1249 |
asynchronous-methods-for-deep-reinforcement | 946.0 |
recurrent-experience-replay-in-distributed | 24235.9 |
generalized-data-distribution-iteration | 1401 |
prioritized-experience-replay | 1054.6 |
self-imitation-learning | 1137.8 |
increasing-the-action-gap-new-operators-for | 874.99 |
asynchronous-methods-for-deep-reinforcement | 970.1 |