8 months ago

Method/Architecture

Zhao Zelin ; Samel Karan ; Chen Binghong ; Song Le

Abstract

Programs, consisting of semantic and structural information, play an important role in the communication between humans and agents. Towards learning general program executors to unify perception, reasoning, and decision making, we formulate program-guided tasks which require learning to execute a given program on the observed task specification. Furthermore, we propose the Program-guided Transformer (ProTo), which integrates both semantic and structural guidance of a program by leveraging cross-attention and masked self-attention to pass messages between the specification and routines in the program. ProTo executes a program in a learned latent space and enjoys stronger representation ability than previous neural-symbolic approaches. We demonstrate that ProTo significantly outperforms the previous state-of-the-art methods on GQA visual reasoning and 2D Minecraft policy learning datasets. Additionally, ProTo demonstrates better generalization to unseen, complex, and human-written programs.
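The abstract names two mechanisms: cross-attention between program routines and the task specification, and masked self-attention among routines. The sketch below is only a rough illustration of that idea in plain Python, not the authors' implementation. The function names (`attention`, `structure_mask`, `proto_like_step`), the `parents` encoding of program structure, and the residual sum are all assumptions made for this example. A real model would also use learned projections, multiple heads, and several layers.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, mask=None):
    # Scaled dot-product attention over lists of vectors.
    # mask[i][j] == False blocks query i from attending to key j.
    d = len(keys[0])
    width = len(values[0])
    out = []
    for i, q in enumerate(queries):
        scores = []
        for j, k in enumerate(keys):
            s = sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            if mask is not None and not mask[i][j]:
                s = -1e9  # effectively zero weight after softmax
            scores.append(s)
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, values))
                    for c in range(width)])
    return out


def structure_mask(parents):
    # Hypothetical structure encoding: parents[i] lists the routines whose
    # results routine i consumes. A routine may attend to itself and to them.
    n = len(parents)
    return [[j == i or j in parents[i] for j in range(n)] for i in range(n)]


def proto_like_step(routines, spec, parents):
    # Semantic guidance (assumed form): each routine embedding reads from
    # the specification features via cross-attention.
    attended = attention(routines, spec, spec)
    mixed = [[r + a for r, a in zip(rv, av)]
             for rv, av in zip(routines, attended)]
    # Structural guidance (assumed form): routines exchange messages only
    # along the edges allowed by the program structure.
    return attention(mixed, mixed, mixed, structure_mask(parents))


# Toy inputs: three routine embeddings and two specification features.
routines = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
spec = [[1.0, 0.0], [0.0, 1.0]]
parents = [[], [0], [0, 1]]  # routine 2 depends on routines 0 and 1
latent = proto_like_step(routines, spec, parents)
```

In this toy setup routine 0 has no dependencies, so the mask leaves it attending only to itself, while routine 2 can aggregate information from both of its inputs.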

