Publication Date
Spring 2020
Degree Type
Master's Project
Degree Name
Master of Science (MS)
Department
Computer Science
First Advisor
Chris Pollett
Second Advisor
Leonard Wesley
Third Advisor
Philip Heller
Keywords
Generative Adversarial Network (GAN), video generation, StyleGAN, 3D convolutions
Abstract
Generative models have shown impressive results in generating synthetic images. However, video synthesis remains difficult to achieve, even for these generative models. The best videos that generative models can currently create are only a few seconds long, distorted, and low resolution. For this project, I propose and implement a model that synthesizes videos of human facial expressions at 1024x1024 resolution and 32 frames per clip, starting from static images generated by a Generative Adversarial Network trained on human facial images. To the best of my knowledge, this is the first work that generates realistic videos at resolutions above 256x256 from a single starting image. The model improves video synthesis both quantitatively and qualitatively over two state-of-the-art models, TGAN and MoCoGAN. In the quantitative comparison, this project achieves a best Average Content Distance (ACD) score of 0.167 (lower is better), compared to 0.305 for TGAN and 0.201 for MoCoGAN.
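For context, ACD is the content-consistency metric introduced with MoCoGAN: each frame of a video is mapped to an identity embedding (for faces, the MoCoGAN authors used an OpenFace network), and the score is the mean pairwise L2 distance between those per-frame embeddings, averaged over all videos; a lower score means the face identity drifts less across frames. The following is a minimal NumPy sketch of that computation, not code from this project; the function name, the embedding source, and the array shapes are illustrative assumptions.

    import numpy as np

    def average_content_distance(videos_embeddings):
        # ACD as described in the MoCoGAN paper: for each video, take the
        # mean pairwise L2 distance between per-frame identity embeddings,
        # then average those per-video scores. Lower is better.
        per_video_scores = []
        for emb in videos_embeddings:                   # emb: (num_frames, dim)
            n = emb.shape[0]
            diffs = emb[:, None, :] - emb[None, :, :]   # (n, n, dim)
            dists = np.linalg.norm(diffs, axis=-1)      # (n, n)
            upper = np.triu_indices(n, k=1)             # distinct frame pairs
            per_video_scores.append(dists[upper].mean())
        return float(np.mean(per_video_scores))

    # Hypothetical usage: 16 videos, 32 frames each, 128-dim embeddings.
    rng = np.random.default_rng(0)
    fake_embeddings = [rng.normal(size=(32, 128)) for _ in range(16)]
    print(average_content_distance(fake_embeddings))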
Recommended Citation
Zhang, Lei, "Video Synthesis from the StyleGAN Latent Space" (2020). Master's Projects. 924.
DOI: https://doi.org/10.31979/etd.ywry-3qps
https://scholarworks.sjsu.edu/etd_projects/924
Included in
Artificial Intelligence and Robotics Commons, Graphics and Human Computer Interfaces Commons