Video Prediction on Kinetics-600 12 Frames
Metrics
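Cond: number of conditioning (context) frames
FVD: Fréchet Video Distance (lower is better)
Pred: number of predicted frames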
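
For reference, FVD (Fréchet Video Distance) is commonly defined as the Fréchet distance between two Gaussians fitted to features of real and generated videos, with the features usually taken from a pretrained I3D network. The symbols below are notation introduced here for illustration and do not appear on this page: \mu_r, \Sigma_r are the mean and covariance of the real-video features, and \mu_g, \Sigma_g those of the generated-video features.

```latex
\mathrm{FVD} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Both terms are non-negative, so a lower FVD means the generated feature distribution is closer to the real one.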
Results
Reported results for each model on this benchmark. A dash (-) marks a value that was not reported.
Comparison Table
Model | Cond (frames) | FVD (lower is better) | Pred (frames) |
---|---|---|---|
scaling-autoregressive-video-models | 5 | 170±5 | 11 |
omnitokenizer-a-joint-image-video-tokenizer | - | 32.9 | - |
larp-tokenizing-videos-with-a-learned-1 | 5 | 5.1 | 11 |
efficient-video-generation-on-complex | 5 | 69.15±0.78 | 11 |
latent-video-transformer | 5 | 224.73 | 11 |
scalable-adaptive-computation-for-iterative | - | 10.8 | - |
transformation-based-adversarial-video | 5 | 25.74±0.66 | 11 |
magvit-masked-generative-video-transformer | 5 | 9.9±0.3 | 11 |
language-model-beats-diffusion-tokenizer-is | - | 4.3±0.1 | - |
ccvs-context-aware-controllable-video | 5 | 55±1 | 11 |
magvit-masked-generative-video-transformer | 5 | 24.5±0.9 | 11 |
photorealistic-video-generation-with | - | 3.3 | - |
predicting-video-with-vqvae-1 | 4 | 64.30±2.04 | 12 |
scalable-adaptive-computation-for-iterative | - | 11.5 | - |
diffusion-models-for-video-prediction-and | 5 | 16.46 | 11 |
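
The table is not sorted, so a short script can make the ranking easier to read. The following is a minimal sketch, assuming the rows are available as pipe-separated text exactly as printed above. The helper names (`parse_fvd`, `parse_rows`, `rank_by_fvd`) are hypothetical and written for this illustration. It splits cells such as `69.15±0.78` into a mean and a spread, then sorts by mean FVD in ascending order.

```python
# Rows copied verbatim from the comparison table: model | cond | fvd | pred |
ROWS = """\
scaling-autoregressive-video-models | 5 | 170±5 | 11 |
omnitokenizer-a-joint-image-video-tokenizer | - | 32.9 | - |
larp-tokenizing-videos-with-a-learned-1 | 5 | 5.1 | 11 |
efficient-video-generation-on-complex | 5 | 69.15±0.78 | 11 |
latent-video-transformer | 5 | 224.73 | 11 |
scalable-adaptive-computation-for-iterative | - | 10.8 | - |
transformation-based-adversarial-video | 5 | 25.74±0.66 | 11 |
magvit-masked-generative-video-transformer | 5 | 9.9±0.3 | 11 |
language-model-beats-diffusion-tokenizer-is | - | 4.3±0.1 | - |
ccvs-context-aware-controllable-video | 5 | 55±1 | 11 |
magvit-masked-generative-video-transformer | 5 | 24.5±0.9 | 11 |
photorealistic-video-generation-with | - | 3.3 | - |
predicting-video-with-vqvae-1 | 4 | 64.30±2.04 | 12 |
scalable-adaptive-computation-for-iterative | - | 11.5 | - |
diffusion-models-for-video-prediction-and | 5 | 16.46 | 11 |
"""


def parse_fvd(cell):
    """Split an FVD cell like '69.15±0.78' into (mean, spread); spread is None if absent."""
    mean, _, spread = cell.partition("±")
    return float(mean), (float(spread) if spread else None)


def parse_int(cell):
    """Convert a frame-count cell to int, mapping '-' to None."""
    return None if cell == "-" else int(cell)


def parse_rows(text):
    """Parse pipe-separated table rows into a list of dicts."""
    entries = []
    for line in text.strip().splitlines():
        # Drop the trailing pipe, then split into the four cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        model, cond, fvd, pred = cells
        mean, spread = parse_fvd(fvd)
        entries.append({
            "model": model,
            "cond": parse_int(cond),
            "fvd": mean,
            "fvd_spread": spread,
            "pred": parse_int(pred),
        })
    return entries


def rank_by_fvd(entries):
    """Return entries sorted by mean FVD, lowest (best) first."""
    return sorted(entries, key=lambda e: e["fvd"])


if __name__ == "__main__":
    for rank, e in enumerate(rank_by_fvd(parse_rows(ROWS)), start=1):
        print(f"{rank:2d}. {e['fvd']:7.2f}  {e['model']}")
```

Rows with a dash under Cond and Pred do not state their frame split on this page, so comparing them directly with the 5/11 and 4/12 rows may not be like-for-like. Filtering on `e["cond"] is not None` before ranking is one way to restrict the comparison.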