HyperAI
Language Modelling On Wikitext 103
Metrics
Number of params
Test perplexity
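Test perplexity, the ranking metric on this benchmark, is the exponential of the average per-token negative log-likelihood the model assigns to the test set (lower is better). A minimal sketch of the computation, assuming per-token losses are already available in nats:

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp(mean per-token negative log-likelihood in nats).

    `token_nlls` is a hypothetical list of per-token cross-entropy losses;
    in practice these come from evaluating a language model on the test set.
    """
    return math.exp(sum(token_nlls) / len(token_nlls))

# An average loss of ~3.37 nats/token corresponds to a perplexity near 29,
# roughly the level of the strongest entries in the table below.
print(round(perplexity([3.4, 3.3, 3.4, 3.4]), 1))
```

Note that results are only comparable when computed with the same tokenization, since perplexity is defined per token.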
Results
Performance results of various models on this benchmark
| Model name | Number of params | Test perplexity | Paper Title |
|---|---|---|---|
| LSTM | - | 48.7 | Improving Neural Language Models with a Continuous Cache |
| Temporal CNN | - | 45.2 | Convolutional Sequence Modeling Revisited |
| TCN | - | 45.19 | An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling |
| GCNN-8 | - | 44.9 | Language Modeling with Gated Convolutional Networks |
| Neural cache model (size = 100) | - | 44.8 | Improving Neural Language Models with a Continuous Cache |
| Neural cache model (size = 2,000) | - | 40.8 | Improving Neural Language Models with a Continuous Cache |
| GPT-2 Small | 124M | 37.50 | Language Models are Unsupervised Multitask Learners |
| GCNN-8 | - | 37.2 | Language Modeling with Gated Convolutional Networks |
| LSTM | - | 36.4 | Fast Parametric Learning with Activation Memorization |
| LSTM (Hebbian) | - | 34.3 | Fast Parametric Learning with Activation Memorization |
| 4 layer QRNN | 151M | 33.0 | An Analysis of Neural Language Modeling at Multiple Scales |
| AWD-LSTM-MoS + ATOI | - | 32.85 | Alleviating Sequence Information Loss with Data Overlapping and Prime Batch Sizes |
| DEQ-Transformer (small) | 138M | 32.4 | Deep Equilibrium Models |
| LSTM (RMC) | - | 31.6 | Relational recurrent neural networks |
| Primal.+Trans. | - | 31.0 | Primal-Attention: Self-attention through Asymmetric Kernel SVD in Primal Representation |
| Rfa-Gate-Gaussian-Stateful (Small) | - | 30.5 | Random Feature Attention |
| LSTM (Hebbian, Cache) | - | 29.7 | Fast Parametric Learning with Activation Memorization |
| LSTM (Hebbian, Cache, MbPA) | - | 29.2 | Fast Parametric Learning with Activation Memorization |
| Trellis Network | - | 29.19 | Trellis Networks for Sequence Modeling |
| DEQ-TrellisNet | 180M | 29.0 | Deep Equilibrium Models |