7 months ago

Abstract

Recent studies have significantly improved the state-of-the-art on common-sense reasoning (CSR) benchmarks like the Winograd Schema Challenge (WSC) and SWAG. The question we ask in this paper is whether improved performance on these benchmarks represents genuine progress towards common-sense-enabled systems. We make case studies of both benchmarks and design protocols that clarify and qualify the results of previous work by analyzing threats to the validity of previous experimental designs. Our protocols account for several properties prevalent in common-sense benchmarks including size limitations, structural regularities, and variable instance difficulty.

Source PDF

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding

Ready-to-use GPUs

Best Pricing

Get Started View Pricing

HyperAI Newsletters

Subscribe to our latest updates

We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning

7 months ago

Paul Trichelair; Ali Emami; Adam Trischler; Kaheer Suleman; Jackie Chi Kit Cheung

Abstract

Source PDF

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding

Ready-to-use GPUs

Best Pricing

Get Started View Pricing

HyperAI Newsletters

Subscribe to our latest updates

We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning

Command Palette

How Reasonable are Common-Sense Reasoning Tasks: A Case-Study on the Winograd Schema Challenge and SWAG

Paul Trichelair; Ali Emami; Adam Trischler; Kaheer Suleman; Jackie Chi Kit Cheung

Abstract

Build AI with AI

HyperAI Newsletters

Command Palette

How Reasonable are Common-Sense Reasoning Tasks: A Case-Study on the Winograd Schema Challenge and SWAG

Paul Trichelair; Ali Emami; Adam Trischler; Kaheer Suleman; Jackie Chi Kit Cheung

Abstract

Build AI with AI

HyperAI Newsletters

Command Palette

How Reasonable are Common-Sense Reasoning Tasks: A Case-Study on the Winograd Schema Challenge and SWAG

Paul Trichelair; Ali Emami; Adam Trischler; Kaheer Suleman; Jackie Chi Kit Cheung

Abstract

Build AI with AI

HyperAI Newsletters