Question Answering on UniProtQA
Metrics
BLEU-2
BLEU-4
METEOR
ROUGE-1
ROUGE-2
ROUGE-L
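As a rough illustration of what the n-gram overlap metrics above measure, here is a minimal sketch of a ROUGE-N style F1 score. It assumes whitespace tokenization and F1 aggregation; the benchmark's actual tokenizer, scoring library, and aggregation are not stated on this page, so the function `rouge_n` and the example sentences are hypothetical and will not reproduce the leaderboard numbers.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def rouge_n(reference, candidate, n=1):
    """ROUGE-N style F1 between two strings.

    Assumption: tokens are split on whitespace; real evaluations
    usually apply their own tokenization and normalization.
    """
    ref_counts = Counter(ngrams(reference.split(), n))
    cand_counts = Counter(ngrams(candidate.split(), n))
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the reference and the candidate.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference answer and model answer.
reference = "the protein binds ATP"
candidate = "the protein binds DNA"
print(rouge_n(reference, candidate, n=1))  # 0.75
```

Three of the four unigrams match, so precision and recall are both 0.75. Passing `n=2` scores bigram overlap instead, which is the idea behind ROUGE-2.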
Results
Performance of each model on this benchmark
Model | BLEU-2 | BLEU-4 | METEOR | ROUGE-1 | ROUGE-2 | ROUGE-L | Paper Title | Repository |
---|---|---|---|---|---|---|---|---|
Llama2-7B-chat | 0.019 | 0.002 | 0.052 | 0.103 | 0.060 | 0.009 | Llama 2: Open Foundation and Fine-Tuned Chat Models | |
BioMedGPT-10B | 0.571 | 0.535 | 0.754 | 0.743 | 0.759 | 0.622 | BioMedGPT: Open Multimodal Generative Pre-trained Transformer for BioMedicine | |