HyperAI超神经

Conversational Web Navigation on WebLINX

Evaluation Metrics

Element (IoU)
Intent Match
Overall score
Text (F1)
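
For readers unfamiliar with the two named metrics, the following is a minimal sketch of the generic definitions of IoU (intersection over union) and F1. It is not the benchmark's official scoring code: the function names `box_iou` and `token_f1`, the `(x1, y1, x2, y2)` box format, and whitespace tokenization are all assumptions made for illustration, and the benchmark may compute its text score differently.

```python
from collections import Counter


def box_iou(a, b):
    """Generic IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    The box format is an assumption for illustration only.
    """
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def token_f1(prediction, reference):
    """Generic token-level F1 between two strings.

    Whitespace tokenization is an assumption; other F1 variants
    (for example character n-gram based) are equally plausible.
    """
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(token_f1("click the button", "click button"))
```

Both functions return values in the range 0 to 1; the scores in the table appear to be reported on a 0 to 100 scale.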

Evaluation Results

Performance of each model on this benchmark

Comparison Table
Model name | Element (IoU) | Intent Match | Overall score | Text (F1)
weblinx-real-world-website-navigation-with | 8.62 | 42.77 | 8.51 | 3.45
weblinx-real-world-website-navigation-with | 20.54 | 83.32 | 23.73 | 25.85
weblinx-real-world-website-navigation-with | 8.28 | 81.80 | 16.88 | 25.21
weblinx-real-world-website-navigation-with | 16.50 | 79.89 | 20.94 | 23.16
weblinx-real-world-website-navigation-with | 15.70 | 80.07 | 19.97 | 22.30
weblinx-real-world-website-navigation-with | 22.82 | 81.91 | 25.21 | 26.60
weblinx-real-world-website-navigation-with | 18.64 | 77.56 | 21.22 | 22.39
weblinx-real-world-website-navigation-with | 13.39 | 75.87 | 15.13 | 13.58
weblinx-real-world-website-navigation-with | 15.36 | 80.02 | 17.27 | 14.05
weblinx-real-world-website-navigation-with | 12.05 | 74.25 | 12.63 | 7.67
weblinx-real-world-website-navigation-with | 6.20 | 79.71 | 12.51 | 16.40
weblinx-real-world-website-navigation-with | 22.60 | 84.00 | 25.02 | 27.17
weblinx-real-world-website-navigation-with | 10.85 | 41.66 | 10.72 | 6.75
weblinx-real-world-website-navigation-with | 14.86 | 79.69 | 14.99 | 9.21
weblinx-real-world-website-navigation-with | 20.31 | 81.14 | 23.77 | 25.75
weblinx-real-world-website-navigation-with | 10.91 | 42.36 | 10.45 | 6.21
weblinx-real-world-website-navigation-with | 22.26 | 82.64 | 24.57 | 26.50