Vchitect 2.0
Embark on a Visual Fantasy Journey
20-second video generation, flexible aspect ratios, generative space-time enhancement, long video evaluation
Text-to-Video
Vchitect 2.0, developed by the Shanghai AI Lab, is an advanced video generation model designed to empower video creation.
Generates videos 5 to 20 seconds long.
Supports flexible aspect ratios, so users can generate videos of any aspect ratio.
Produces high-definition output with integrated super-resolution and frame interpolation, plus user-adjustable content correction.
Image-to-Video
Transforms static images into 5- to 10-second videos, letting users convert photos or designs into captivating visual experiences with little effort.
Video Generation Benchmark -- VBench
VBench now supports evaluating long video generative models.
Comprehensive and Continuously Updated Evaluation Leaderboard
The leaderboard covers 28 text-to-video models and 12 image-to-video models.
Open Source, One-click Evaluation Deployment
VBench is recognized as the industry standard for automated evaluation of video generation and has been widely reported by media outlets such as the South China Morning Post. Major video generation models have adopted the evaluation suite, which improves the consistency and transparency of video generation evaluation.
Supports Multiple Long Video Models
VBench has upgraded its evaluation suite to support mainstream long-video generation models, including Gen-3, Kling, and OpenSora.
| Model | Total Score | Quality Score | Semantic Score | Subject Consistency | Background Consistency | Temporal Flickering | Motion Smoothness | Dynamic Degree | Aesthetic Quality | Imaging Quality | Object Class | Multiple Objects | Human Action | Color | Spatial Relationship | Scene | Appearance Style | Temporal Style | Overall Consistency |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Gen-3 | 82.32% | 84.11% | 75.17% | 97.10% | 96.62% | 98.61% | 99.23% | 60.14% | 63.34% | 66.82% | 87.81% | 53.64% | 96.40% | 80.90% | 65.09% | 54.57% | 24.31% | 24.71% | 26.69% |
| Kling (2024-07 high-performance mode) | 81.85% | 83.39% | 75.68% | 98.33% | 97.60% | 99.30% | 99.40% | 46.94% | 61.21% | 65.62% | 87.24% | 68.05% | 93.40% | 89.90% | 73.03% | 50.86% | 19.62% | 24.17% | 26.42% |
| CogVideoX-5B-SAT (SAT prompt-optimized) | 81.61% | 82.75% | 77.04% | 96.23% | 96.52% | 98.66% | 96.92% | 70.97% | 61.98% | 62.90% | 85.23% | 62.11% | 99.40% | 82.81% | 66.35% | 53.20% | 24.91% | 25.38% | 27.59% |
| Vchitect 2.0-2B | 81.57% | 82.51% | 77.79% | 96.42% | 96.53% | 98.45% | 97.76% | 58.33% | 61.47% | 65.60% | 87.81% | 69.35% | 97.00% | 86.87% | 54.64% | 57.51% | 24.93% | 25.56% | 28.01% |
| CogVideoX-2B-SAT (SAT prompt-optimized) | 80.91% | 82.18% | 75.83% | 96.78% | 96.63% | 98.89% | 97.73% | 59.86% | 60.82% | 61.68% | 83.37% | 62.63% | 98.00% | 79.41% | 69.90% | 51.14% | 24.80% | 24.36% | 26.66% |
| OpenSorav1-2 (8s) | 79.76% | 81.35% | 73.39% | 96.75% | 97.61% | 99.53% | 98.50% | 42.39% | 56.85% | 63.34% | 82.22% | 51.83% | 91.20% | 90.08% | 68.56% | 42.44% | 23.95% | 24.54% | 26.85% |
| OpenSoraPlanv1-1 | 78.00% | 80.91% | 66.38% | 95.73% | 96.73% | 99.03% | 98.28% | 47.72% | 56.85% | 62.28% | 76.30% | 40.35% | 86.80% | 89.19% | 53.11% | 27.17% | 22.90% | 23.87% | 26.52% |
| OpenSorav1-1 | 75.66% | 77.75% | 67.35% | 92.35% | 97.52% | 98.31% | 92.78% | 82.14% | 50.12% | 54.90% | 86.76% | 40.97% | 84.20% | 74.56% | 52.47% | 38.63% | 23.50% | 23.86% | 26.37% |
| Mira384 | 71.87% | 78.83% | 44.21% | 96.23% | 96.92% | 98.29% | 97.54% | 60.33% | 42.51% | 60.16% | 52.06% | 12.52% | 63.80% | 42.24% | 27.83% | 16.34% | 21.89% | 18.77% | 18.72% |
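The page does not say how the Total Score is derived from the Quality Score and Semantic Score. The sketch below tests one plausible reading: a weighted average with a 4:1 weighting of quality over semantic. That weighting is an assumption, not something stated here, and the names `QUALITY_WEIGHT`, `SEMANTIC_WEIGHT`, `weighted_total`, and `max_deviation` are illustrative. The score triples are copied from the leaderboard above.

```python
# (model, total, quality, semantic) in percent, copied from the leaderboard.
ROWS = [
    ("Gen-3", 82.32, 84.11, 75.17),
    ("Kling (2024-07 high-performance mode)", 81.85, 83.39, 75.68),
    ("CogVideoX-5B-SAT (SAT prompt-optimized)", 81.61, 82.75, 77.04),
    ("Vchitect 2.0-2B", 81.57, 82.51, 77.79),
    ("CogVideoX-2B-SAT (SAT prompt-optimized)", 80.91, 82.18, 75.83),
    ("OpenSorav1-2 (8s)", 79.76, 81.35, 73.39),
    ("OpenSoraPlanv1-1", 78.00, 80.91, 66.38),
    ("OpenSorav1-1", 75.66, 77.75, 67.35),
    ("Mira384", 71.87, 78.83, 44.21),
]

# ASSUMPTION: these weights are not given on this page; they are a guess
# that is checked against the listed totals below.
QUALITY_WEIGHT = 4
SEMANTIC_WEIGHT = 1


def weighted_total(quality: float, semantic: float) -> float:
    """Weighted average of the two sub-scores under the assumed weights."""
    weight_sum = QUALITY_WEIGHT + SEMANTIC_WEIGHT
    return (QUALITY_WEIGHT * quality + SEMANTIC_WEIGHT * semantic) / weight_sum


def max_deviation(rows) -> float:
    """Largest absolute gap between a listed total and the recomputed one."""
    return max(abs(weighted_total(q, s) - total) for _, total, q, s in rows)


if __name__ == "__main__":
    for name, total, quality, semantic in ROWS:
        recomputed = weighted_total(quality, semantic)
        print(f"{name:42s} listed={total:.2f} recomputed={recomputed:.2f}")
    print(f"max deviation: {max_deviation(ROWS):.3f} percentage points")
```

Under this assumed weighting, every recomputed total lands within 0.05 percentage points of the listed value. The small gaps are compatible with the sub-scores having been rounded to two decimals before publication, though the page itself does not confirm either the weighting or the rounding.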
All Rights Reserved. Record No. 2021009351-21 of Shanghai ICP Filing
Contact Us: vchitect@pjlab.org.cn