Language Modelling On Wiki 40B

Evaluation Metrics

Perplexity
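
Perplexity is the exponentiated average negative log-likelihood the model assigns to each held-out token; lower is better. For a test sequence of $N$ tokens $x_1, \dots, x_N$, the standard definition is:

$$\mathrm{PPL} = \exp\!\left(-\frac{1}{N}\sum_{i=1}^{N}\log p_\theta\big(x_i \mid x_{<i}\big)\right)$$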

Evaluation Results

Performance results of each model on this benchmark:

| Model Name | Perplexity | Paper Title | Repository |
|---|---|---|---|
| FLASH-Quad-8k | 14.998 | Transformer Quality in Linear Time | - |
| Combiner-Axial-8k | 16.49 | Combiner: Full Attention Transformer with Sparse Computation Cost | - |
| Combiner-Fixed-8k | 16.60 | Combiner: Full Attention Transformer with Sparse Computation Cost | - |
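
For reference, below is a minimal sketch of how sequence perplexity is typically computed from a causal language model's cross-entropy loss. It uses the Hugging Face transformers API with GPT-2 as a stand-in, since the listed models (FLASH-Quad, Combiner) are not distributed as public checkpoints; the model name and example text are illustrative assumptions, not the benchmark's official evaluation pipeline.

```python
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPT-2 is a stand-in model for illustration only; the leaderboard
# entries above are not available as off-the-shelf checkpoints.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# Any held-out text would do; this sentence is a placeholder.
text = "The quick brown fox jumps over the lazy dog."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels == input_ids makes the model return the mean
    # cross-entropy, i.e. the average negative log-likelihood per token.
    outputs = model(**inputs, labels=inputs["input_ids"])

# Perplexity is the exponential of the per-token negative log-likelihood.
perplexity = math.exp(outputs.loss.item())
print(f"Perplexity: {perplexity:.2f}")
```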