AI-powered dubbing is transforming content localization by delivering faster turnaround and lower costs. By combining advanced tools with a human-in-the-loop approach, media companies can build a customized dubbing studio tailored to their production needs. This post demonstrates how to use Sieve's dubbing pipeline and its human-in-the-loop features to create a streamlined dubbing editor.
Key Human-in-the-Loop Features for Dubbing
1. Multiple Output Modes
The translation-only output mode provides translated text without generating audio. This feature is ideal for reviewing or editing translations before finalizing the dubbing. Use the output_mode parameter to switch between:
- translation-only: Text translations only
- voice-dubbing: Fully dubbed output with spoken translations
2. Editable Segments with edit_segments
The edit_segments parameter enables selective editing of specific portions of the media. This is particularly useful for:
- Fixing translations in pre-dubbed videos
- Dubbing specific segments while keeping others intact
- Adding custom translations for selected segments
The parameter accepts a list of segment objects with the following structure:
[
    {
        "start": 0,
        "end": 10,
        "translation": "Hello, how are you?"
    }
]
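As a rough sketch, segment lists in the structure above could be built with a small helper that rejects obviously invalid entries before they reach the pipeline. The `Segment` type and `make_segment` function are hypothetical names introduced here for illustration and are not part of the Sieve API; the sketch also assumes that `start` and `end` are times in seconds, which the structure above does not state.

```python
from typing import TypedDict


class Segment(TypedDict):
    start: float       # segment start time (assumed to be seconds)
    end: float         # segment end time (assumed to be seconds)
    translation: str   # translated text for this time range


def make_segment(start: float, end: float, translation: str) -> Segment:
    """Build one segment dict, rejecting obviously invalid values."""
    if start < 0 or end <= start:
        raise ValueError(f"invalid time range: start={start}, end={end}")
    if not translation.strip():
        raise ValueError("translation must not be empty")
    return {"start": start, "end": end, "translation": translation}


# Hypothetical example data in the same shape as the structure above
edit_segments = [
    make_segment(0, 10, "Hello, how are you?"),
    make_segment(10, 18.5, "I am fine, thank you."),
]
```

Validating locally keeps simple mistakes, such as a swapped start and end, from costing a full dubbing run.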
Additional Features for Media Companies
- Speaker Style Preservation: Maintain natural voice quality and tone during dubbing with ElevenLabs voice-cloning TTS engines for an authentic experience.
- Multi-Speaker Support: Handle videos with multiple speakers by assigning a distinct voice to each speaker, which suits movies and interviews.
- Scalable Translations: Translate into 29 languages simultaneously, making it easier to localize content for global audiences.
- Background Audio Retention: Preserve the original background audio in dubbed content for seamless, natural-sounding results. Background scores are essential for conveying emotions in movies.
- Safe Words: Specify words you don't want to translate, such as names, places, or specific terms, ensuring consistency across all outputs.
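The code in this post passes safe words as a single comma-separated string such as "Missy, Sheldon, Mary". If the names are kept in a list (for example, a show's character sheet), a minimal sketch like the following could produce that string. `build_safewords` is a hypothetical helper, not part of Sieve, and the exact separator the pipeline accepts is assumed from the examples in this post.

```python
def build_safewords(words: list[str]) -> str:
    """Join safe words into one comma-separated string.

    Strips surrounding whitespace and drops blanks and exact duplicates,
    keeping the first occurrence of each word.
    """
    seen = set()
    cleaned = []
    for word in words:
        word = word.strip()
        if word and word not in seen:
            seen.add(word)
            cleaned.append(word)
    return ", ".join(cleaned)


# Hypothetical input list with a stray space, a duplicate, and a blank entry
safewords = build_safewords(["Missy", "Sheldon ", "Mary", "Missy", ""])
print(safewords)
```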
Multi-Step Dubbing Workflow
Using the features outlined above, we can create a professional-quality dubbing studio with the Sieve Dubbing pipeline in a two-step process:
1. Translation Preview
Use translation-only mode to generate and review translations. Make necessary edits to ensure accuracy.
import sieve
# Source video to translate
source_file = sieve.File(url="https://storage.googleapis.com/sieve-prod-us-central1-public-file-upload-bucket/99d82ab9-7214-47b3-98f3-05367c2180dc/5ead25ea-a1b5-42be-909e-d4ca179fe9ed-input-source_file.mp4")
target_language = "hindi"
translation_engine = "gpt4"
voice_engine = "elevenlabs (voice cloning)"
transcription_engine = "whisper-zero"
output_mode = "translation-only"  # return translated text only, no audio
safewords = "Missy, Sheldon, Mary"  # names to leave untranslated
dubbing = sieve.function.get("sieve/dubbing")
output = dubbing.run(source_file, target_language, translation_engine,
                     voice_engine, transcription_engine, output_mode,
                     safewords=safewords)
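The review step itself can be as simple as overwriting individual translations. The sketch below assumes the translation preview has already been converted into a list of dicts in the edit_segments structure (start, end, translation); the exact shape of what the pipeline returns in translation-only mode is not covered here, so treat the `draft` data as illustrative. `apply_corrections` is a hypothetical helper, not part of Sieve.

```python
import copy


def apply_corrections(segments, corrections):
    """Return a copy of `segments` with selected translations replaced.

    `corrections` maps a segment's index in the list to its corrected text.
    The input list is left unmodified so the draft can still be compared
    against the reviewed version.
    """
    edited = copy.deepcopy(segments)
    for index, new_text in corrections.items():
        if not 0 <= index < len(edited):
            raise IndexError(f"no segment at index {index}")
        edited[index]["translation"] = new_text
    return edited


# Illustrative draft in the edit_segments structure
draft = [
    {"start": 0, "end": 4, "translation": "Hello, how are you?"},
    {"start": 4, "end": 9, "translation": "I am good."},
]

# A reviewer corrects only the second segment
reviewed = apply_corrections(draft, {1: "I am doing well."})
```

Because every segment is carried over, the reviewed list still spans the same time ranges as the draft, with only the corrected text changed.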
2. Final Dubbing
Feed the edited translations back using the edit_segments parameter and select voice-dubbing as the output mode. Ensure you provide translations for the entire duration of the media, not just the edited portions.
# Continues from step 1: sieve is already imported
source_file = sieve.File(url="https://sieve-prod-us-central1-persistent-bucket.storage.googleapis.com/0a27f1ed-b241-4a1e-8b3c-e8aff3b8379c/cae53169-5ec9-4ed9-88c1-fc5e3859dccf/f770f4a8-e86b-46f0-ac15-fb1b4d1dc9bb/tmpx6upc6mh.mp4")
target_language = "hindi"
output_mode = "voice-dubbing"  # generate the fully dubbed output
edit_segments = output  # step 1 translations after review; must follow the segment format shown earlier
preserve_background_audio = True  # keep the original background score
safewords = "Missy, Sheldon, Mary"
enable_lipsyncing = True
lipsync_backend = "sievesync"
dubbing = sieve.function.get("sieve/dubbing")
output = dubbing.run(source_file, target_language, output_mode=output_mode,
                     edit_segments=edit_segments,
                     preserve_background_audio=preserve_background_audio,
                     safewords=safewords,
                     enable_lipsyncing=enable_lipsyncing,
                     lipsync_backend=lipsync_backend)
# Print each result produced by the dubbing run
for output_object in output:
    print(output_object)
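Since the final step expects translations for the whole media rather than only the edited parts, it can help to check a segment list for uncovered time ranges before submitting it. The following is a minimal sketch under stated assumptions: `find_gaps` is a hypothetical helper, times are assumed to be in seconds, and the media duration is supplied by the caller. A reported gap may simply be a silent stretch, so the result is a prompt for human review and not an error.

```python
def find_gaps(segments, duration, tolerance=0.05):
    """Return (start, end) ranges of the media not covered by any segment.

    Gaps shorter than `tolerance` are ignored so tiny rounding differences
    between adjacent segments are not reported.
    """
    gaps = []
    cursor = 0.0  # end of the region covered so far
    for seg in sorted(segments, key=lambda s: s["start"]):
        if seg["start"] - cursor > tolerance:
            gaps.append((cursor, seg["start"]))
        cursor = max(cursor, seg["end"])
    if duration - cursor > tolerance:
        gaps.append((cursor, duration))
    return gaps


# Illustrative segments for a 25-second clip
segments = [
    {"start": 0, "end": 10, "translation": "Hello, how are you?"},
    {"start": 12, "end": 20, "translation": "I am fine, thank you."},
]
gaps = find_gaps(segments, duration=25)
print(gaps)  # ranges a reviewer should confirm are silent or fill in
```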
Real-World Applications
Here are examples of web series and movie clips dubbed using the above pipeline. The original background score was preserved in the dubbed videos, and character names were specified as safe words so they would not be translated.
Hindi-dubbed version of a clip from the web series Young Sheldon.
Mandarin-dubbed version of a clip from the movie The Intern.
Conclusion
Building an AI-powered dubbing studio with Sieve's dubbing pipeline enables media companies to streamline localization workflows. By leveraging features like human-in-the-loop translation, multi-speaker support, and background audio integration, you can efficiently produce high-quality dubbed content.
For personalized support or to book a demo, email us at contact@sievedata.com.