Accurately Lip-syncing Educational Videos


Interactive Demo

Or choose from the example pairs below!

Note: If you do not get a video result back, the face detector most likely failed to detect a face in every frame of the input video. This can sometimes happen with animated movie clips. Generating results may take some time (usually no more than a minute). All results are currently limited to at most 480p resolution and are trimmed to a maximum of 20 seconds to minimize compute latency. This interactive site is only a user-friendly demonstration of the bare minimum capabilities of the Wav2Lip model.
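To check ahead of time how these limits would affect an input clip, the two caps can be sketched in a few lines of Python. This is only an illustration, not the code the demo runs: the names plan_clip, ClipPlan, MAX_HEIGHT and MAX_SECONDS are hypothetical, "480p" is assumed to mean an output height of 480 pixels, and rounding the width down to an even number is an assumption made because many video encoders expect even dimensions.

```python
from dataclasses import dataclass

MAX_HEIGHT = 480     # assumption: "480p" means an output height of 480 pixels
MAX_SECONDS = 20.0   # clip length limit stated in the note above


@dataclass
class ClipPlan:
    """Hypothetical container for the size and duration after the limits."""
    width: int
    height: int
    seconds: float


def plan_clip(width: int, height: int, seconds: float) -> ClipPlan:
    """Return the size and duration a clip would have after applying both caps."""
    if width <= 0 or height <= 0 or seconds < 0:
        raise ValueError("width and height must be positive, seconds non-negative")
    if height > MAX_HEIGHT:
        scale = MAX_HEIGHT / height
        # Keep the aspect ratio; round the width down to an even number
        # (an assumption, see the lead-in).
        width = max(2, int(width * scale) // 2 * 2)
        height = MAX_HEIGHT
    # Clips longer than the limit are cut to the limit; shorter ones are untouched.
    return ClipPlan(width, height, min(seconds, MAX_SECONDS))
```

Under these assumptions, a 1920x1080 input lasting 45 seconds would come out as 852x480 and 20 seconds, while a 640x360 input lasting 12.5 seconds would be left unchanged.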


Disclaimer

All results from this demo website or the open-source code should be used for research, academic, or personal purposes only. As the models are trained on the LRS2 dataset, any form of commercial use is strictly prohibited. Please contact us with any further queries.

Ethical use

To ensure fair use, we strongly require that any result created using this site or our code unambiguously present itself as synthetic and state that it was generated using the Wav2Lip model. In addition to the strong positive applications of this work, our intention in completely open-sourcing it is to simultaneously encourage efforts to detect manipulated video content and its misuse. We believe that Wav2Lip can enable several positive applications and also encourage productive discussions and research efforts regarding fair use of synthetic content.