Atari Games On Atari 2600 Gravitar
Evaluation Metric
Score
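The reported Score is the undiscounted sum of in-game rewards over an episode, as returned by the Arcade Learning Environment. As a brief, hedged illustration (not part of the original leaderboard), the sketch below shows how such an episode score is typically accumulated using the Gymnasium packaging of ALE; the `ALE/Gravitar-v5` environment id is the standard one, and the random policy stands in for any of the listed agents.

```python
# Minimal sketch of measuring an episode Score on Gravitar via Gymnasium + ale-py.
# Assumes gymnasium >= 1.0 and ale-py >= 0.9; the random policy is a placeholder agent.
import gymnasium as gym
import ale_py

gym.register_envs(ale_py)  # explicit registration of ALE environments (ale-py >= 0.9)
env = gym.make("ALE/Gravitar-v5")

obs, info = env.reset(seed=0)
episode_score = 0.0
done = False
while not done:
    action = env.action_space.sample()  # placeholder for a trained agent's policy
    obs, reward, terminated, truncated, info = env.step(action)
    episode_score += reward  # Score = undiscounted sum of game rewards
    done = terminated or truncated

print(f"Episode score: {episode_score}")
env.close()
```

Leaderboard entries usually average this episode score over many evaluation episodes; the exact protocol (number of episodes, sticky actions, no-op starts) varies by paper.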
Evaluation Results
Performance results of each model on this benchmark.
Comparison Table
Model Name | Score |
---|---|
Model 1 | 429.0 |
evolving-simple-programs-for-playing-atari | 2350 |
human-level-control-through-deep | 306.7 |
evolution-strategies-as-a-scalable | 805.0 |
agent57-outperforming-the-atari-human | 19213.96 |
exploration-by-self-supervised-exploitation | 4643 |
mastering-atari-go-chess-and-shogi-by | 6682.70 |
exploration-by-self-supervised-exploitation | 2741 |
exploration-by-self-supervised-exploitation | 6712 |
self-imitation-learning | 1874.2 |
asynchronous-methods-for-deep-reinforcement | 320.0 |
gdi-rethinking-what-makes-reinforcement | 5905 |
count-based-exploration-with-the-successor | 1078.3 |
learning-values-across-many-orders-of | 483.5 |
dueling-network-architectures-for-deep | 588.0 |
online-and-offline-reinforcement-learning-by | 8006.93 |
deep-reinforcement-learning-with-double-q | 298.0 |
impala-scalable-distributed-deep-rl-with | 359.50 |
a-distributional-perspective-on-reinforcement | 440.0 |
generalized-data-distribution-iteration | 5915 |
increasing-the-action-gap-new-operators-for | 446.92 |
the-arcade-learning-environment-an-evaluation | 2850 |
exploration-by-random-network-distillation | 3906 |
the-arcade-learning-environment-an-evaluation | 387.7 |
count-based-exploration-with-neural-density | 238.0 |
dueling-network-architectures-for-deep | 297.0 |
unifying-count-based-exploration-and | 238.68 |
dueling-network-architectures-for-deep | 412.0 |
generalized-data-distribution-iteration | 5905 |
increasing-the-action-gap-new-operators-for | 417.65 |
prioritized-experience-replay | 548.5 |
distributed-prioritized-experience-replay | 1598.5 |
large-scale-study-of-curiosity-driven | 1165.1 |
recurrent-experience-replay-in-distributed | 15680.7 |
mastering-atari-with-discrete-world-models-1 | 3789 |
dueling-network-architectures-for-deep | 238.0 |
dna-proximal-policy-optimization-with-a-dual | 2190 |
policy-optimization-with-penalized-point | 557.17 |
deep-exploration-via-bootstrapped-dqn | 286.1 |
prioritized-experience-replay | 269.5 |
distributional-reinforcement-learning-with-1 | 995 |
asynchronous-methods-for-deep-reinforcement | 303.5 |
first-return-then-explore | 7588 |
noisy-networks-for-exploration | 2209 |
implicit-quantile-networks-for-distributional | 911 |
count-based-exploration-with-neural-density | 498.3 |
deep-reinforcement-learning-with-double-q | 473.0 |
train-a-real-world-local-path-planner-in-one | 760 |
deep-reinforcement-learning-with-double-q | 200.5 |
asynchronous-methods-for-deep-reinforcement | 269.5 |
deep-reinforcement-learning-with-double-q | 167.0 |
massively-parallel-methods-for-deep | 538.4 |
fully-parameterized-quantile-function-for | 1406.0 |