Language Modelling On Wikitext 103

Evaluation metrics: Number of params, Test perplexity

Evaluation results — performance of each model on this benchmark:

| Model | Number of params | Test perplexity | Paper |
|---|---|---|---|
| LSTM | - | 48.7 | Improving Neural Language Models with a Continuous Cache |
| Temporal CNN | - | 45.2 | Convolutional Sequence Modeling Revisited |
| TCN | - | 45.19 | An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling |
| GCNN-8 | - | 44.9 | Language Modeling with Gated Convolutional Networks |
| Neural cache model (size = 100) | - | 44.8 | Improving Neural Language Models with a Continuous Cache |
| Neural cache model (size = 2,000) | - | 40.8 | Improving Neural Language Models with a Continuous Cache |
| GPT-2 Small | 124M | 37.50 | Language Models are Unsupervised Multitask Learners |
| GCNN-8 | - | 37.2 | Language Modeling with Gated Convolutional Networks |
| LSTM | - | 36.4 | Fast Parametric Learning with Activation Memorization |
| LSTM (Hebbian) | - | 34.3 | Fast Parametric Learning with Activation Memorization |
| 4 layer QRNN | 151M | 33.0 | An Analysis of Neural Language Modeling at Multiple Scales |
| AWD-LSTM-MoS + ATOI | - | 32.85 | Alleviating Sequence Information Loss with Data Overlapping and Prime Batch Sizes |
| DEQ-Transformer (small) | 138M | 32.4 | Deep Equilibrium Models |
| LSTM (RMC) | - | 31.6 | Relational recurrent neural networks |
| Primal.+Trans. | - | 31.0 | Primal-Attention: Self-attention through Asymmetric Kernel SVD in Primal Representation |
| Rfa-Gate-Gaussian-Stateful (Small) | - | 30.5 | Random Feature Attention |
| LSTM (Hebbian, Cache) | - | 29.7 | Fast Parametric Learning with Activation Memorization |
| LSTM (Hebbian, Cache, MbPA) | - | 29.2 | Fast Parametric Learning with Activation Memorization |
| Trellis Network | - | 29.19 | Trellis Networks for Sequence Modeling |
| DEQ-TrellisNet | 180M | 29.0 | Deep Equilibrium Models |