HyperAI
Language Modelling On Text8
Metrics: Bit per Character (BPC)

Results: performance of various models on this benchmark.
| Model Name | Bit per Character (BPC) | Paper Title |
| --- | --- | --- |
| td-LSTM (Zhang et al., 2016) | 1.63 | Architectural Complexity Measures of Recurrent Neural Networks |
| td-LSTM-large | 1.49 | Architectural Complexity Measures of Recurrent Neural Networks |
| BFN | 1.41 | Bayesian Flow Networks |
| Unregularised mLSTM | 1.40 | Multiplicative LSTM for sequence modelling |
| BN LSTM | 1.36 | Recurrent Batch Normalization |
| LayerNorm HM-LSTM | 1.29 | Hierarchical Multiscale Recurrent Neural Networks |
| Large mLSTM +emb +WN +VD | 1.27 | Multiplicative LSTM for sequence modelling |
| Large RHN | 1.27 | Recurrent Highway Networks |
| Bipartite flows (8 flows) | 1.23 | Discrete Flows: Invertible Generative Models of Discrete Data |
| mLSTM + dynamic eval | 1.19 | Dynamic Evaluation of Neural Sequence Models |
| 12-layer Character Transformer Model | 1.18 | Character-Level Language Modeling with Deeper Self-Attention |
| PAR Transformer 24B | 1.18 | Pay Attention when Required |
| GAM-RHN-10 | 1.157 | Recurrent Highway Networks with Grouped Auxiliary Memory |
| 64-layer Character Transformer Model | 1.13 | Character-Level Language Modeling with Deeper Self-Attention |
| 12L Transformer + 8K adaptive span | 1.11 | Adaptive Attention Span in Transformers |
| BP-Transformer - 12 Layers | 1.11 | BP-Transformer: Modelling Long-Range Context via Binary Partitioning |
| All-attention network - 18 layers | 1.11 | Augmenting Self-attention with Persistent Memory |
| Transformer-LS (small) | 1.09 | Long-Short Transformer: Efficient Transformers for Language and Vision |
| All-attention network - 36 layers | 1.08 | Augmenting Self-attention with Persistent Memory |
| Transformer-XL - 24 layers | 1.08 | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context |
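The benchmark's metric, Bit per Character (BPC), is the model's average negative log2-probability of each ground-truth character, i.e. cross-entropy measured in bits; lower is better. A minimal sketch of the computation, assuming a hypothetical list of the probabilities a model assigned to each correct character:

```python
import math

def bits_per_character(char_probs):
    """Average number of bits needed to encode each character,
    given the model's predicted probability for the correct
    character at each position (hypothetical input list)."""
    return -sum(math.log2(p) for p in char_probs) / len(char_probs)

# A model assigning probability 0.5 to every correct character
# needs exactly 1 bit per character.
print(bits_per_character([0.5, 0.5, 0.5, 0.5]))  # → 1.0
```

For example, Transformer-XL's 1.08 BPC means it needs about 1.08 bits on average to encode each character of the Text8 test set.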