# Language Modelling on Penn Treebank (Word Level)
## Metrics

- Params
- Test perplexity
- Validation perplexity
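Both perplexity columns use the standard word-level definition: the exponentiated average negative log-likelihood per word over the evaluation split (the symbol N below, introduced here for clarity, is the number of words in that split). Lower is better.

```latex
\mathrm{PPL} = \exp\!\left( -\frac{1}{N} \sum_{i=1}^{N} \log p\left(w_i \mid w_{<i}\right) \right)
```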
## Results

Performance results of the various models on this benchmark.
| Model | Params | Test perplexity | Validation perplexity | Paper |
| --- | --- | --- | --- | --- |
| TCN | 14.7M | 108.47 | - | Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling |
| Seq-U-Net | 14.9M | 107.95 | - | Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling |
| GRU (Bai et al., 2018) | - | 92.48 | - | An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling |
| R-Transformer | - | 84.38 | - | R-Transformer: Recurrent Neural Network Enhanced Transformer |
| Zaremba et al. (2014) - LSTM (medium) | - | 82.7 | 86.2 | Recurrent Neural Network Regularization |
| Gal & Ghahramani (2016) - Variational LSTM (medium) | - | 79.7 | 81.9 | A Theoretically Grounded Application of Dropout in Recurrent Neural Networks |
| LSTM (Bai et al., 2018) | - | 78.93 | - | An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling |
| Zaremba et al. (2014) - LSTM (large) | - | 78.4 | 82.2 | Recurrent Neural Network Regularization |
| Gal & Ghahramani (2016) - Variational LSTM (large) | - | 75.2 | 77.9 | A Theoretically Grounded Application of Dropout in Recurrent Neural Networks |
| Inan et al. (2016) - Variational RHN | - | 66.0 | 68.1 | Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling |
| Recurrent highway networks | 23M | 65.4 | 67.9 | Recurrent Highway Networks |
| NAS-RL | 25M | 64.0 | - | Neural Architecture Search with Reinforcement Learning |
| Efficient NAS | 24M | 58.6 | 60.8 | Efficient Neural Architecture Search via Parameter Sharing |
| AWD-LSTM | 24M | 57.3 | 60.0 | Regularizing and Optimizing LSTM Language Models |
| DEQ-TrellisNet | 24M | 57.1 | - | Deep Equilibrium Models |
| AWD-LSTM 3-layer with Fraternal dropout | 24M | 56.8 | 58.9 | Fraternal Dropout |
| Dense IndRNN | - | 56.37 | - | Deep Independently Recurrent Neural Network (IndRNN) |
| Differentiable NAS | 23M | 56.1 | 58.3 | DARTS: Differentiable Architecture Search |
| AWD-LSTM-DRILL | 24M | 55.7 | 58.2 | Deep Residual Output Layers for Neural Language Generation |
| 2-layer skip-LSTM + dropout tuning | 24M | 55.3 | 57.1 | Pushing the bounds of dropout |
The table above shows the first 20 of 43 leaderboard entries, sorted from highest to lowest test perplexity.