Text-to-Video Generation on UCF-101
Metrics
FVD16
Results
Performance results of different models on this benchmark
| Model | FVD16 | Paper Title |
|---|---|---|
| MagicVideo (Zero-shot, 256x256) | 699 | MagicVideo: Efficient Video Generation With Latent Diffusion Models |
| Video LDM (Zero-shot, 320x512) | 550.61 | Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models |
| LAVIE (Zero-shot, 320x512) | 526.30 | LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models |
| PYoCo (Zero-shot, 64x64) | 355.19 | Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models |
| VideoPoet | 355 | VideoPoet: A Large Language Model for Zero-Shot Video Generation |
| Lumiere (Zero-shot, 1024x1024) | 332.49 | Lumiere: A Space-Time Diffusion Model for Video Generation |
| Snap Video (Zero-shot, 288x288) | 260.1 | Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis |
| W.A.L.T 3B | 258.1 | Photorealistic Video Generation with Diffusion Models |
| PixelDance (Zero-shot, 256x256) | 242.82 | Make Pixels Dance: High-Dynamic Video Generation |
| Snap Video (Zero-shot, 512x288) | 200.2 | Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis |