Lipsync
A comprehensive solution for video lipsyncing with a suite of different models and enhancement options.
Available backends include:
- SieveSync: This backend uses a proprietary alignment technique with optimized MuseTalk and LivePortrait for faster inference and better sync with the audio. Videos without many motion/scene cuts work best with this backend.
- MuseTalk: This backend uses the MuseTalk model combined with CodeFormer (optional but recommended) to sync the lips in the driver video/image with the provided audio and restore the face.
- Video Retalking: This backend uses the Video Retalking model combined with GPEN and GFPGAN to sync the lips in the driver video/image with the provided audio.
For pricing, see the Pricing section.
For examples, see the Examples section.
For tips to ensure better performance, see "Tips for better performance".
Ethical Considerations
Lipsync technologies come with social risks, particularly the potential for misuse in creating deepfakes. To mitigate these risks, it’s crucial to follow ethical guidelines and adopt responsible usage practices. Currently, the synthesized results contain visual artifacts that may help in detecting deepfakes as well as watermarks that identify the use of Sieve. Please note that we do not assume any legal responsibility for the use of the results generated by this app.
Please reach out to us at sales@sievedata.com or via Discord if you have any questions or concerns or if you want to request a watermark removal.
Important Notes:
- Enhance applies restoration to the face only and does not affect the resolution of the video.
- The processing time depends on video resolution and video length along with the amount of time a valid speaker is detected.
- SieveSync is a custom backend that combines the best of both worlds, running at 25 FPS with high face fidelity and good lip movement.
- MuseTalk is preferred for better overall face fidelity, and Video Retalking for better lip movement and resolution. MuseTalk runs at 25 FPS, whereas Video Retalking can handle higher frame rates.
Tips for better performance:
- Ensure there is only a single primary speaker in the video
- Ensure the person is facing the camera
- Ensure the person is not wearing any accessories that cover the mouth (e.g. mask, scarf, etc.)
- Ensure the person is not moving their head too much
- Ensure the person's face is not very small in the frame
- The MuseTalk and SieveSync backends may perform unreliably if the person has a lot of facial hair
- Downsampling to 720p can help decrease processing times and artifacts in unstable videos; enable it by setting `downsample` to `true`
Information on the `cut_by` parameter:
- The duration of the audio file always supersedes the duration of the video file.
- When `audio` is selected as the input and the video is shorter than the audio, the video is played until the end then played backward to the start, and so on until it meets the duration of the audio.
- When `video` is selected as the input and the video is shorter than the audio, the audio is cut off when the video ends.
- When `shortest` is selected, the file with the shorter duration between the two decides the duration, and the files are cut off accordingly.
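The back-and-forth looping described for `audio` can be illustrated with a small sketch. This is not Sieve's implementation: the function name `ping_pong_indices` is hypothetical, and the sketch assumes the first and last frames are not repeated each time playback reverses, which the description above does not specify.

```python
def ping_pong_indices(n_video_frames: int, n_output_frames: int) -> list[int]:
    """Map each output frame to a source frame, playing forward then backward.

    Assumption (not confirmed by the source): the end frames are not
    duplicated when the playback direction reverses.
    """
    if n_video_frames < 1:
        raise ValueError("the video needs at least one frame")
    if n_video_frames == 1:
        # A single frame can only be held for the whole duration.
        return [0] * n_output_frames
    period = 2 * (n_video_frames - 1)  # one forward pass plus one backward pass
    indices = []
    for i in range(n_output_frames):
        k = i % period
        # The first half of the period runs forward, the second half backward.
        indices.append(k if k < n_video_frames else period - k)
    return indices


# A 4-frame video stretched to cover 9 frames' worth of audio.
print(ping_pong_indices(4, 9))  # [0, 1, 2, 3, 2, 1, 0, 1, 2]
```

Working in frame indices rather than seconds keeps the sketch independent of frame rate; the output length would be the audio duration multiplied by the video's FPS.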
Pricing
Backend | Enhance | Price per Minute |
---|---|---|
SieveSync | True | $0.50 |
SieveSync | False | $0.35 |
MuseTalk | True | $0.35 |
MuseTalk | False | $0.20 |
Video Retalking | True | $0.45 |
Video Retalking | False | $0.30 |
Notes:
- Any content above 1080p will be downsampled to 1080p
- The "Enhance" option applies additional processing for improved quality
- Prices are subject to change. Please refer to our latest documentation for the most up-to-date pricing information.
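As a rough illustration of how the table translates into a job cost, the sketch below multiplies the per-minute rate by the clip length. The rates are copied from the table above. The function name `estimate_cost` and the idea that billing is pro-rated linearly by duration are illustrative assumptions, not something the pricing notes confirm.

```python
from decimal import Decimal

# Per-minute rates in USD from the pricing table, keyed by (backend, enhance).
RATES_PER_MINUTE = {
    ("SieveSync", True): Decimal("0.50"),
    ("SieveSync", False): Decimal("0.35"),
    ("MuseTalk", True): Decimal("0.35"),
    ("MuseTalk", False): Decimal("0.20"),
    ("Video Retalking", True): Decimal("0.45"),
    ("Video Retalking", False): Decimal("0.30"),
}


def estimate_cost(backend: str, enhance: bool, duration_seconds: float) -> Decimal:
    """Estimate the price in USD, assuming linear pro-rating by duration."""
    rate = RATES_PER_MINUTE[(backend, enhance)]
    # Convert via str() so float inputs do not carry binary rounding noise.
    return rate * Decimal(str(duration_seconds)) / Decimal(60)


# Estimated cost of a 90-second clip on MuseTalk with enhancement enabled.
print(estimate_cost("MuseTalk", True, 90))
```

`Decimal` is used instead of floats so that currency arithmetic stays exact. Actual charges may differ, since prices are subject to change.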
Examples
Backend | Enhance | Price | Sieve Job |
---|---|---|---|
SieveSync | True | $1.42 | Here |
SieveSync | False | $0.1 | Here |
MuseTalk | True | $0.122 | Here |
Video Retalking | True | $0.07 | Here |