4 months ago

Zichen Wen Shaobo Wang Yufa Zhou Junyuan Zhang Qintong Zhang Yifeng Gao Zhaorun Chen Bin Wang Weijia Li Conghui He

Abstract

Visual tokens consume substantial computational resources in multi-modal large language models (MLLMs), significantly compromising their efficiency. Recent works have attempted to improve efficiency by compressing visual tokens during training, either through modifications to model components or by introducing additional parameters. However, they often overlook the increased learning difficulty caused by such compression, as the model's parameter space struggles to quickly adapt to the substantial perturbations in the feature space induced by token compression. In this work, we propose to develop Efficient MLLMs via Progressive Consistency Distillation (EPIC), a progressive learning framework. Specifically, by decomposing the feature space perturbations introduced by token compression along the token-wise and layer-wise dimensions, we introduce token consistency distillation and layer consistency distillation, respectively, aiming to reduce the training difficulty by leveraging guidance from a teacher model and following a progressive learning trajectory. Extensive experiments demonstrate the superior effectiveness, robustness, and generalization capabilities of our proposed framework.
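
The abstract gives no implementation details, so the following is only a minimal sketch of what a progressive, teacher-guided training signal could look like. Everything in it is an assumption made for illustration: the linear schedule, the `final_keep_ratio` value, the toy truncation used as "compression", and the KL-divergence loss are hypothetical stand-ins, not EPIC's actual components.

```python
import math


def keep_ratio_schedule(step, total_steps, final_keep_ratio=0.25):
    """Hypothetical progressive schedule: the fraction of visual tokens kept
    shrinks linearly from 1.0 (no compression) to final_keep_ratio."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 1.0 - progress * (1.0 - final_keep_ratio)


def compress_tokens(tokens, keep_ratio):
    """Toy stand-in for token compression: keep the leading tokens only.
    A real method would pick tokens by importance or merge them."""
    n_keep = max(1, math.ceil(keep_ratio * len(tokens)))
    return tokens[:n_keep]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(teacher_logits, student_logits):
    """Assumed distillation signal: the student, which sees compressed
    tokens, is pulled toward the teacher's output distribution."""
    return kl_divergence(softmax(teacher_logits), softmax(student_logits))


if __name__ == "__main__":
    visual_tokens = list(range(16))  # placeholder token ids
    for step in (0, 50, 100):
        ratio = keep_ratio_schedule(step, total_steps=100)
        kept = compress_tokens(visual_tokens, ratio)
        print(step, ratio, len(kept))
```

The point of the schedule is that the student never faces the full compression ratio at once. The perturbation grows gradually while the teacher's outputs provide a stable target, which is the intuition the abstract describes.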

Multimodal

Transformer

Multimodal Representation

Method/Architecture

Multimodality

Task/Problem

Efficient Multi-modal Large Language Models via Progressive Consistency Distillation