Yi Wang, Yinan He, Yizhuo Li, Kunchang Li, Jiashuo Yu, Xin Ma, Xinhao Li, Guo Chen, Xinyuan Chen, Yaohui Wang, Conghui He, Ping Luo, Ziwei Liu, Yali Wang, Limin Wang, Yu Qiao (2024). InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation. In ICLR.
Yaohui Wang, Xinyuan Chen, Xin Ma, Shangchen Zhou, Ziqi Huang, Yi Wang, Ceyuan Yang, Yinan He, Jiashuo Yu, Peiqing Yang, Yuwei Guo, Tianxing Wu, Chenyang Si, Yuming Jiang, Cunjian Chen, Chen Change Loy, Bo Dai, Dahua Lin, Yu Qiao, Ziwei Liu (2023). LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models.