HyperAI

Conversational Web Navigation on WebLINX

Metrics

Element (IoU)
Intent Match
Overall score
Text (F1)
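
As a rough illustration of two of the metric families named above, the sketch below computes a generic intersection-over-union between two bounding boxes and a generic token-overlap F1 between two strings. It only shows the textbook definitions: the function names `box_iou` and `token_f1`, the `(x1, y1, x2, y2)` box layout, and whitespace tokenization are assumptions made for this example, and the benchmark's own scoring code may differ.

```python
from collections import Counter


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2) (assumed layout)."""
    # Intersection width/height clamp to 0 when the boxes do not overlap.
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def token_f1(pred, ref):
    """F1 over whitespace tokens (assumed tokenization), counting repeats."""
    p, r = pred.split(), ref.split()
    # Multiset intersection gives the number of matched tokens.
    overlap = sum((Counter(p) & Counter(r)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(p)
    recall = overlap / len(r)
    return 2 * precision * recall / (precision + recall)
```

For example, `box_iou((0, 0, 2, 2), (1, 1, 3, 3))` has an intersection of area 1 and a union of area 7, and `token_f1("click the button", "click button")` has precision 2/3 and recall 1.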

Results

Performance results of various models on this benchmark

| Model Name | Element (IoU) | Intent Match | Overall score | Text (F1) | Paper Title |
|---|---|---|---|---|---|
| GPT-3.5T (Zero-Shot) | 8.62 | 42.77 | 8.51 | 3.45 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
| S-LLaMA-1.3B | 20.54 | 83.32 | 23.73 | 25.85 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
| Pix2Act-1.3B | 8.28 | 81.80 | 16.88 | 25.21 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
| MindAct-3B | 16.50 | 79.89 | 20.94 | 23.16 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
| Fuyu-8B | 15.70 | 80.07 | 19.97 | 22.30 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
| Llama-2-13B | 22.82 | 81.91 | 25.21 | 26.60 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
| GPT-3.5F | 18.64 | 77.56 | 21.22 | 22.39 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
| MindAct-780M | 13.39 | 75.87 | 15.13 | 13.58 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
| Flan-T5-780M | 15.36 | 80.02 | 17.27 | 14.05 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
| MindAct-250M | 12.05 | 74.25 | 12.63 | 7.67 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
| Pix2Act-282M | 6.20 | 79.71 | 12.51 | 16.40 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
| S-LLaMA-2.7B | 22.60 | 84.00 | 25.02 | 27.17 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
| GPT-4T (Zero-Shot) | 10.85 | 41.66 | 10.72 | 6.75 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
| Flan-T5-250M | 14.86 | 79.69 | 14.99 | 9.21 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
| Flan-T5-3B | 20.31 | 81.14 | 23.77 | 25.75 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
| GPT-4V (Zero-Shot) | 10.91 | 42.36 | 10.45 | 6.21 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |
| Llama-2-7B | 22.26 | 82.64 | 24.57 | 26.50 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue |