Atari Games On Atari 2600 Centipede
Evaluation Metric
Score
Evaluation Results
Performance results of each model on this benchmark
Comparison Table
| Model Name | Score |
| --- | --- |
| gdi-rethinking-what-makes-reinforcement | 155830 |
| self-imitation-learning | 7559.5 |
| deep-exploration-via-bootstrapped-dqn | 4553.5 |
| deep-reinforcement-learning-with-double-q | 4657.7 |
| deep-reinforcement-learning-with-double-q | 5570.2 |
| generalized-data-distribution-iteration | 155830 |
| online-and-offline-reinforcement-learning-by | 874301.64 |
| evolution-strategies-as-a-scalable | 7783.9 |
| dueling-network-architectures-for-deep | 4881.0 |
| asynchronous-methods-for-deep-reinforcement | 3306.5 |
| first-return-then-explore | 1422628 |
| asynchronous-methods-for-deep-reinforcement | 1997.0 |
| a-distributional-perspective-on-reinforcement | 9646.0 |
| recurrent-experience-replay-in-distributed | 599140.3 |
| deep-reinforcement-learning-with-double-q | 3973.9 |
| increasing-the-action-gap-new-operators-for | 4539.55 |
| deep-reinforcement-learning-with-double-q | 3853.5 |
| massively-parallel-methods-for-deep | 6296.9 |
| Model 19 | 4647.0 |
| generalized-data-distribution-iteration | 195630 |
| dueling-network-architectures-for-deep | 5409.4 |
| mastering-atari-go-chess-and-shogi-by | 1159049.27 |
| impala-scalable-distributed-deep-rl-with | 11049.75 |
| evolving-simple-programs-for-playing-atari | 24708 |
| gdi-rethinking-what-makes-reinforcement-1 | 1359533 |
| noisy-networks-for-exploration | 7596 |
| asynchronous-methods-for-deep-reinforcement | 3755.8 |
| dueling-network-architectures-for-deep | 7561.4 |
| distributed-prioritized-experience-replay | 12974 |
| implicit-quantile-networks-for-distributional | 11561 |
| train-a-real-world-local-path-planner-in-one | 3899.8 |
| agent57-outperforming-the-atari-human | 412847.86 |
| policy-optimization-with-penalized-point | 3315.44 |
| learning-values-across-many-orders-of | 49065.8 |
| mastering-atari-with-discrete-world-models-1 | 11883 |
| the-arcade-learning-environment-an-evaluation | 125123 |
| distributional-reinforcement-learning-with-1 | 12447 |
| the-reactor-a-fast-and-sample-efficient-actor | 3422.0 |
| dueling-network-architectures-for-deep | 7687.5 |
| prioritized-experience-replay | 4463.2 |
| the-arcade-learning-environment-an-evaluation | 8803.8 |
| dna-proximal-policy-optimization-with-a-dual | 100194 |
| increasing-the-action-gap-new-operators-for | 4225.18 |
| prioritized-experience-replay | 3489.1 |
| human-level-control-through-deep | 8309.0 |