Language Modelling On Wiki 40B

Evaluation Metrics

Perplexity
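
Perplexity is the exponentiated average negative log-likelihood the model assigns to each held-out token; lower is better. For a test sequence of $N$ tokens $x_1, \dots, x_N$, the standard definition is:

$$\mathrm{PPL} = \exp\!\left(-\frac{1}{N}\sum_{i=1}^{N}\log p_\theta\big(x_i \mid x_{<i}\big)\right)$$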

Evaluation Results

Performance results of each model on this benchmark:

| Model Name | Perplexity | Paper Title | Repository |
|---|---|---|---|
| FLASH-Quad-8k | 14.998 | Transformer Quality in Linear Time | - |
| Combiner-Axial-8k | 16.49 | Combiner: Full Attention Transformer with Sparse Computation Cost | - |
| Combiner-Fixed-8k | 16.60 | Combiner: Full Attention Transformer with Sparse Computation Cost | - |
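
For reference, below is a minimal sketch of how sequence perplexity is typically computed from a causal language model's cross-entropy loss. It uses the Hugging Face transformers API with GPT-2 as a stand-in, since the listed models (FLASH-Quad, Combiner) are not distributed as public checkpoints; the model name and example text are illustrative assumptions, not the benchmark's official evaluation pipeline.

```python
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPT-2 is a stand-in model for illustration only; the leaderboard
# entries above are not available as off-the-shelf checkpoints.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# Any held-out text would do; this sentence is a placeholder.
text = "The quick brown fox jumps over the lazy dog."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels == input_ids makes the model return the mean
    # cross-entropy, i.e. the average negative log-likelihood per token.
    outputs = model(**inputs, labels=inputs["input_ids"])

# Perplexity is the exponential of the per-token negative log-likelihood.
perplexity = math.exp(outputs.loss.item())
print(f"Perplexity: {perplexity:.2f}")
```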