Course Outline — Spring 2026


Course Information


Course Description

This advanced research seminar treats video generation not merely as media synthesis, but as the foundation for General World Models and Embodied Agents. The course begins with the thesis that video models are world models, then surveys frontier visual models (Kimi K2.5, Z-Image, ERNIE 5.0) and the latest advances in Video-LLM unification. From there, we explore Vision-Language-Action (VLA) models, generative agents that act within video worlds, and controlled generation problems including talking heads and drivable 3D avatars. The curriculum also covers 4D dynamic scenes, long-horizon consistency, and the critical topic of safety and provenance for generated media. Throughout the course, students engage with the broader societal implications of AI through curated readings and gain hands-on experience with AI-assisted (vibe) coding tools.


Course Format & Tools


Weekly Schedule

Note: This schedule is subject to change to reflect the rapid pace of recent advances in AI.

Week | Date | Topics | Slides | Student Presenters
1 | Feb 6 | The New Thesis — Video Models as World Models | Slides |
2 | Feb 13 | Video Representation Learning Beyond Reconstruction (VideoGPT, Video Diffusion Models, Multimodal Models, Agents) | Slides |
3 | Feb 20 (CNY, take-home reading) | Video-LLM Unification — From Encoders to Instruction Tuning (Video-LLaMA, Video-LLaVA, LLaVA-Video) | Slides |
4 | Feb 27 | Frontier Visual Models — Kimi K2.5, Z-Image, ERNIE 5.0 | Slides | YANG Zhiqin (embodied dreamer), Jincheng Fang (controllable video gen and business), Fan YANG (HyperDiffusion), Sida Lin (Seedance 1.5)
5 | Mar 6 | VLA — How Vision-Language Meets Control (PaLM-E, RT-2, OpenVLA) | Slides | Yakun Cui (world simulator), Hanquan Yang, Shiyuan Song (UniPi, MDP, robot control), Jianxin Huang (sparse videogen2)
6 | Mar 13 | Video World Models | Slides | Jialiang CHEN (FlashWorld), Haokai Pang (3d gen), PENG Yi (AC Talker), Pengcheng WEN (DreamDojo), Haoze Zheng (LingBot-VA)
7 | Mar 20 | Guest Lecture — Dr. Xu Xian | | Zihao WANG (shotverse), Yihang JIANG, Mingzhe ZHENG, Xuran MA
8 | Mar 27 | Talking Heads (Audio → Video) as a Controlled Generation Problem | | Haoze ZHENG, Wenyuan Mi, Boyu Li, Wuyou Zhou
9 | Apr 3 | Drivable Avatars — From 2D Faces to 3D Gaussian/NeRF Heads | |
10 | Apr 10 | 4D Dynamic Scenes — Text-to-4D, Dynamic NeRFs, and Gaussian Splatting in Motion | | WANG Zhe, Jiapeng Sun, Yuean Lin, Yunfan Zhang
11 | Apr 17 | Long-Horizon Consistency — Memory, Tokens, and Anti-Drift Methods | | Chen Long, Haoyang Zhang, Yinfei Jiang, Hongbo Zhu
12 | Apr 24 | Safety, Provenance, and Reality Defense (for Video + Avatars) | |
13 | May 1 | Final Project | |