👩🏼‍🏫 Invited talk on “Application and Expansion of the DiT Architecture in Video Generation Models” at Intelligent Things

I have been invited to give a talk on “Application and Expansion of the DiT Architecture in Video Generation Models” at Intelligent Things.

Abstract

In this talk, I will first review the current state of video generation research and its recent advances. I will then introduce Latte, a Transformer-based video diffusion model, and present visual comparisons of generated videos. Finally, I will conclude with a discussion of potential future directions and task extensions in text-to-video generation.

Xin Ma

I’m a Ph.D. candidate at Monash University. My research interests include image super-resolution and inpainting, model compression, face recognition, video generation, and large-scale generative models.