EX-4D: Generate Free-Viewpoint Video From Monocular Video
1. Tutorial Introduction

EX-4D is a 4D video generation framework released by ByteDance's Pico team on July 3, 2025. It generates high-quality 4D video at extreme viewpoints from a single monocular video input. The framework is built on a Depth Watertight Mesh (DW-Mesh) representation that explicitly models both visible and occluded regions, ensuring geometric consistency under extreme camera poses. It uses a simulated occlusion-mask strategy to produce effective training data from monocular videos alone, and a lightweight LoRA-based video diffusion adapter to synthesize physically plausible and temporally coherent videos. EX-4D significantly outperforms existing methods at extreme viewpoints, offering a new solution for 4D video generation. The associated paper is "EX-4D: EXtreme Viewpoint 4D Video Synthesis via Depth Watertight Mesh".
This tutorial uses a single RTX A6000 GPU as the compute resource.
2. Project Examples

3. Operation Steps
1. After starting the container, click the API address to open the web interface

2. Usage Steps
If "Bad Gateway" is displayed, it means the model is initializing. Since the model is large, please wait about 2-3 minutes and refresh the page.

Parameter Description
- Camera Angle: viewpoint angle of the generated camera, from 30° to 180°; larger angles give a wider field of view.
- Frame Count: number of frames in the generated video.
- Inference Steps: number of diffusion sampling steps; more steps generally improve quality at the cost of speed.
- Random Seed: seed for random number generation, so results can be reproduced.
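For readers who would rather script the demo than use the web page, the following is a minimal sketch of how the parameters above could be passed programmatically. It assumes the web interface is a Gradio app; the API address, the endpoint name /generate, and the argument order are all assumptions to be checked against the actual interface (for example with client.view_api()).

```python
from gradio_client import Client, handle_file

# Assumption: the demo is a Gradio app reachable at the tutorial's API address.
client = Client("http://<your-api-address>")

result = client.predict(
    handle_file("input.mp4"),  # monocular input video
    90,                        # Camera Angle (30-180 degrees)
    49,                        # Frame Count
    30,                        # Inference Steps
    42,                        # Random Seed
    api_name="/generate",      # hypothetical endpoint name; verify with client.view_api()
)
print(result)  # path to the generated free-viewpoint video
```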
4. Discussion
🖌️ If you come across a high-quality project, please leave us a message to recommend it! We have also set up a tutorial exchange group; scan the QR code and note [SD Tutorial] to join, discuss technical issues, and share your results.

Citation Information
The citation information for this project is as follows:
@misc{hu2025ex4dextremeviewpoint4d,
      title={EX-4D: EXtreme Viewpoint 4D Video Synthesis via Depth Watertight Mesh},
      author={Tao Hu and Haoyang Peng and Xiao Liu and Yuewen Ma},
      year={2025},
      eprint={2506.05554},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2506.05554},
}