3 months ago

Guoxin Chen Jing Wu Xinjie Chen Wayne Xin Zhao Ruihua Song Chengxi Li Kai Fan Dayiheng Liu Minpeng Liao

Abstract

Autoformalization, which translates natural language mathematics into machine-verifiable formal statements, is critical for using formal mathematical reasoning to solve math problems stated in natural language. While Large Language Models can generate syntactically correct formal statements, they often fail to preserve the original problem's semantic intent. This limitation arises from the LLM approaches' treating autoformalization as a simplistic translation task which lacks mechanisms for self-reflection and iterative refinement that human experts naturally employ. To address these issues, we propose ReForm, a Reflective Autoformalization method that tightly integrates semantic consistency evaluation into the autoformalization process. This enables the model to iteratively generate formal statements, assess its semantic fidelity, and self-correct identified errors through progressive refinement. To effectively train this reflective model, we introduce Prospective Bounded Sequence Optimization (PBSO), which employs different rewards at different sequence positions to ensure that the model develops both accurate autoformalization and correct semantic validations, preventing superficial critiques that would undermine the purpose of reflection. Extensive experiments across four autoformalization benchmarks demonstrate that ReForm achieves an average improvement of 17.2 percentage points over the strongest baselines. To further ensure evaluation reliability, we introduce ConsistencyCheck, a benchmark of 859 expert-annotated items that not only validates LLMs as judges but also reveals that autoformalization is inherently difficult: even human experts produce semantic errors in up to 38.5% of cases.

Source PDF

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding

Ready-to-use GPUs

Best Pricing

Get Started View Pricing

HyperAI Newsletters

Subscribe to our latest updates

We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning

HyperAI

3 months ago

LLM

Benchmarks

Supervised Fine-Tuning

AI Infra

Method/Architecture

Guoxin Chen Jing Wu Xinjie Chen Wayne Xin Zhao Ruihua Song Chengxi Li Kai Fan Dayiheng Liu Minpeng Liao

Abstract

Source PDF

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding

Ready-to-use GPUs

Best Pricing

Get Started View Pricing

HyperAI Newsletters

Subscribe to our latest updates

We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning

HyperAI

3 months ago

LLM

Benchmarks

Supervised Fine-Tuning

AI Infra

Method/Architecture

Guoxin Chen Jing Wu Xinjie Chen Wayne Xin Zhao Ruihua Song Chengxi Li Kai Fan Dayiheng Liu Minpeng Liao

Abstract

Source PDF

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding

Ready-to-use GPUs

Best Pricing

Get Started View Pricing

HyperAI Newsletters

Subscribe to our latest updates

We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning

ReForm: Reflective Autoformalization with Prospective Bounded Sequence Optimization | Papers | HyperAI

Command Palette

ReForm: Reflective Autoformalization with Prospective Bounded Sequence Optimization

Guoxin Chen Jing Wu Xinjie Chen Wayne Xin Zhao Ruihua Song Chengxi Li Kai Fan Dayiheng Liu Minpeng Liao

Abstract

Build AI with AI

HyperAI Newsletters

Command Palette

ReForm: Reflective Autoformalization with Prospective Bounded Sequence Optimization

Guoxin Chen Jing Wu Xinjie Chen Wayne Xin Zhao Ruihua Song Chengxi Li Kai Fan Dayiheng Liu Minpeng Liao

Abstract

Build AI with AI

HyperAI Newsletters

Command Palette

ReForm: Reflective Autoformalization with Prospective Bounded Sequence Optimization

Guoxin Chen Jing Wu Xinjie Chen Wayne Xin Zhao Ruihua Song Chengxi Li Kai Fan Dayiheng Liu Minpeng Liao

Abstract

Build AI with AI

HyperAI Newsletters