Language Modelling On Hutter Prize
Evaluation Metrics
Bit per Character (BPC)
Number of params
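Bit per Character (BPC) is the model's average negative log-likelihood per character expressed in base 2, so lower is better: 0.94 BPC means the model needs about 0.94 bits on average to encode each character of the test text. As a minimal illustrative sketch (not part of this leaderboard's tooling; the helper name is hypothetical), the conversion from cross-entropy in nats, which most training frameworks report, looks like this:

```python
import math

def bits_per_character(total_nll_nats: float, num_chars: int) -> float:
    """Convert a summed negative log-likelihood (in nats) over a
    character sequence into bits per character: nats / (chars * ln 2)."""
    return total_nll_nats / (num_chars * math.log(2))

# Example: an average cross-entropy of 0.75 nats per character
# corresponds to 0.75 / ln(2) ~= 1.08 BPC.
print(bits_per_character(0.75 * 1000, 1000))  # ~= 1.0820
```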
Evaluation Results
Performance results of each model on this benchmark.
| Model Name | Bit per Character (BPC) | Number of params | Paper Title |
|---|---|---|---|
| RHN - depth 5 [zilly2016recurrent] | 1.31 | - | Recurrent Highway Networks |
| FS-LSTM-4 | 1.277 | 27M | Fast-Slow Recurrent Neural Networks |
| Large RHN | 1.27 | 46M | Recurrent Highway Networks |
| Large FS-LSTM-4 | 1.245 | 47M | Fast-Slow Recurrent Neural Networks |
| Large mLSTM +emb +WN +VD | 1.24 | 46M | Multiplicative LSTM for sequence modelling |
| 3-layer AWD-LSTM | 1.232 | 47M | An Analysis of Neural Language Modeling at Multiple Scales |
| Mogrifier LSTM | 1.122 | 96M | Mogrifier LSTM |
| 12-layer Character Transformer Model | 1.11 | 44M | Character-Level Language Modeling with Deeper Self-Attention |
| mLSTM + dynamic eval | 1.08 | 46M | Dynamic Evaluation of Neural Sequence Models |
| 12-layer Transformer-XL | 1.06 | 41M | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context |
| 64-layer Character Transformer Model | 1.06 | 235M | Character-Level Language Modeling with Deeper Self-Attention |
| 18-layer Transformer-XL | 1.03 | 88M | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context |
| Longformer Small | 1.00 | 41M | Longformer: The Long-Document Transformer |
| Longformer Large | 0.99 | 102M | Longformer: The Long-Document Transformer |
| 24-layer Transformer-XL | 0.99 | 277M | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context |
| Mogrifier LSTM + dynamic eval | 0.988 | 96M | Mogrifier LSTM |
| Compressive Transformer | 0.97 | - | Compressive Transformers for Long-Range Sequence Modelling |
| Transformer-XL + RMS dynamic eval | 0.94 | 277M | Dynamic Evaluation of Transformer Language Models |