Stereotypical Bias Analysis on CrowS-Pairs
Evaluation Metrics
Age
Disability
Gender
Nationality
Overall
Physical Appearance
Race/Color
Religion
Sexual Orientation
Socioeconomic status
Evaluation Results
Performance of each model on this benchmark
Comparison Table
Model | Age | Disability | Gender | Nationality | Overall | Physical Appearance | Race/Color | Religion | Sexual Orientation | Socioeconomic status |
---|---|---|---|---|---|---|---|---|---|---|
opt-open-pre-trained-transformer-language | 64.4 | 76.7 | 62.6 | 61.6 | 67.2 | 74.6 | 64.7 | 62.6 | 76.2 | 73.8 |
galactica-a-large-language-model-for-science-1 | 69.0 | 66.7 | 51.9 | 51.6 | 60.5 | 58.7 | 59.9 | 51.9 | 77.4 | 65.7 |
llama-open-and-efficient-foundation-language-1 | 70.1 | 66.7 | 70.6 | 64.2 | 66.6 | 77.8 | 57.0 | 70.6 | 81.0 | 71.5 |
opt-open-pre-trained-transformer-language | 67.8 | 76.7 | 65.7 | 62.9 | 69.5 | 76.2 | 68.6 | 65.7 | 78.6 | 76.2 |