Language Modelling on Text8
Evaluation Metric
Bits per Character (BPC)
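BPC is the average negative log-likelihood, in base 2, that a model assigns to each true next character; lower is better. The sketch below illustrates the conversion from a summed loss in nats (the unit most deep learning frameworks report) to BPC. The function names `bits_per_character` and `bpc_from_probs` are hypothetical helpers introduced here for illustration and are not taken from any of the listed papers.

```python
import math


def bits_per_character(total_nll_nats: float, num_characters: int) -> float:
    """Convert a summed negative log-likelihood in nats to bits per character."""
    if num_characters <= 0:
        raise ValueError("num_characters must be positive")
    # Dividing by ln(2) changes the logarithm base from e to 2.
    return total_nll_nats / (num_characters * math.log(2))


def bpc_from_probs(probs):
    """BPC from the probabilities a model assigned to each true next character."""
    total_nll_nats = -sum(math.log(p) for p in probs)
    return bits_per_character(total_nll_nats, len(probs))


if __name__ == "__main__":
    # A model that always assigns probability 0.5 to the correct character.
    print(bpc_from_probs([0.5] * 4))
    # A uniform guess over an assumed 27-symbol alphabet (a-z plus space).
    print(bpc_from_probs([1 / 27] * 10))
```

As a point of reference, a uniform guess over a 27-symbol alphabet costs log2(27) bits per character, roughly 4.75, so the scores in the table correspond to models that are far better than chance.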
Evaluation Results
Performance of each model on this benchmark
Comparison Table
Model Name | Bits per Character (BPC) |
---|---|
long-short-transformer-efficient-transformers | 1.09 |
2305-14952 | 0.98 |
language-models-are-unsupervised-multitask | 0.98 |
augmenting-self-attention-with-persistent | 1.08 |
adaptive-attention-span-in-transformers | 1.11 |
bp-transformer-modelling-long-range-context | 1.11 |
character-level-language-modeling-with-deeper | 1.18 |
augmenting-self-attention-with-persistent | 1.11 |
architectural-complexity-measures-of | 1.63 |
dynamic-evaluation-of-neural-sequence-models | 1.19 |
multiplicative-lstm-for-sequence-modelling | 1.27 |
hierarchical-multiscale-recurrent-neural | 1.29 |
architectural-complexity-measures-of | 1.49 |
recurrent-highway-networks-with-grouped | 1.157 |
multiplicative-lstm-for-sequence-modelling | 1.40 |
dynamic-evaluation-of-transformer-language | 1.038 |
discrete-flows-invertible-generative-models | 1.23 |
bayesian-flow-networks | 1.41 |
adaptive-attention-span-in-transformers | 1.07 |
character-level-language-modeling-with-deeper | 1.13 |
recurrent-highway-networks | 1.27 |
transformer-xl-attentive-language-models | 1.08 |
recurrent-batch-normalization | 1.36 |
pay-attention-when-required | 1.18 |