Blog

Our latest product updates and thoughts on state-of-the-art AI capabilities.

June 24, 2025

Dubbing 3.0: Cleaner, Smarter, and More Human AI Dubbing for Developers

by Ahmed Hanzala • 4 min read

We discuss the latest updates to Sieve's dubbing pipeline and how it offers the best multi-speaker handling, translation/voice quality, and emotion expression.

June 11, 2025

Introducing the Dubbing Rubric

by Ahmed Hanzala • 5 min read

A comprehensive rubric for evaluating AI dubbing systems across seven key categories.

May 13, 2025

Creating a Viral AI Video Influencer Bot on X/Twitter

by Adi Panda • 7 min read

Learn how we made TBPNify, an AI Twitter video influencer that can automatically reply to tweets with realistic talking head videos!

April 25, 2025

Introducing Border Detection and Removal

by Gaurang Bharti • 2 min read

We discuss a newly launched solution on Sieve built to automatically detect and remove embedded borders from videos.

January 10, 2025

Bringing world-class lipsync to developers with sync.

by Mokshith Voodarla • 2 min read

We discuss our partnership with sync. to bring their new 1.9.0-beta model into the Sieve ecosystem.

January 7, 2025

Building an Automated Background and Caption Effects Pipeline

by Dikshant Shah • 23 min read

Learn how to build a production-ready pipeline that programmatically enhances videos with AI-powered background replacement and dynamic captions.

December 30, 2024

The Impact of YouTube Dubbing on Global Content

by Akshara Soman • 3 min read

YouTube’s auto-dubbing tool democratizes access to multilingual content creation. We discuss its current capabilities and limitations, and how businesses should strategize around it.

December 26, 2024

Safety Concerns of AI Dubbing and How Sieve Addresses Them

by Akshara Soman • 4 min read

Learn about the key safety concerns in AI dubbing, such as deepfake risks, voice identity theft, and bias, along with how Sieve addresses them.

December 25, 2024

Comparing the best methods for OCR on videos

by Dikshant Shah • 18 min read

A comprehensive comparison of video OCR solutions using modern AI models like Gemini and Florence 2 versus traditional OCR approaches, with practical implementation guides and performance metrics.

December 24, 2024

Building a Comprehensive Video Translation Tool: Subtitles, Voices, Lipsync, and On-Screen Text

by Akshara Soman • 4 min read

A guide to generating subtitles, voices, lipsync, and translated on-screen text for video translation using Sieve.

December 16, 2024

Building a robust ball tracking system for sports with SAM 2

by Dikshant Shah • 15 min read

A comprehensive guide to implementing robust ball tracking in sports videos using SAM 2, with practical solutions for handling scene changes, false positives, and dynamic camera movements.

December 3, 2024

Introducing the Sieve Moderation Suite

by Ahmed Hanzala • 4 min read

We discuss a new suite of moderation pipelines available on Sieve designed for ease of use, customization, and cost-effectiveness.

November 29, 2024

Building the Fastest YouTube Video Summarizer in the World

by Akshara Soman • 5 min read

We discuss various approaches to building a high-performance YouTube video summarizer. Some take visual elements into account, while others focus on audio.

November 28, 2024

Transforming YouTube Videos into NotebookLM-like Conversational Avatars

by Akshara Soman • 5 min read

Learn how we built a pipeline to turn YouTube videos into conversational podcasts using various functions readily available on Sieve.

November 26, 2024

Enabling human-in-the-loop video and audio dubbing systems on Sieve

by Ahmed Hanzala • 2 min read

We introduce new features of the Sieve dubbing pipeline that enable human-in-the-loop experiences to be built on top.

November 23, 2024

Building an AI dubbing app for YouTube videos

by Mokshith Voodarla • 2 min read

We walk through building a simple app to download and dub YouTube videos.

November 19, 2024

Bringing world-class audio enhancement to developers with ai|coustics

by Mokshith Voodarla • 2 min read

We discuss the partnership between Sieve and ai|coustics to bring world-class audio enhancement to developers.

November 13, 2024

Introducing Vanish: the best way to remove backgrounds from video

by Jacob Marshall • 2 min read

We discuss a new pipeline for removing backgrounds from video that offers high-quality outputs on complex scenes as well as a fast option for simpler videos.

November 7, 2024

Introducing Portrait Avatars: generate talking head videos from images and audio

by Gaurang Bharti • 2 min read

We discuss a new pipeline for generating talking head videos from images and audio as well as partnerships with Hedra Labs and Infinity AI.

November 6, 2024

A developer’s guide to background noise removal in video and audio

by Ahmed Hanzala • 3 min read

A practical guide to removing background noise from videos using traditional signal processing, advanced AI models for noise suppression, and intelligent source separation methods.

November 5, 2024

Speaker Recognition Guide: How to Detect Speakers in Video and Audio

by Mokshith Voodarla • 2 min read

A guide to implementing speaker recognition in video and audio using diarization and active speaker detection techniques.

October 17, 2024

Dubbing 2.0: The highest quality AI dubbing solution for developers

by Ahmed Hanzala • 5 min read

We discuss the latest updates to Sieve's dubbing pipeline and how it offers the best speech quality, translation controls, and pricing for developers.

October 16, 2024

Kaiber partners with Sieve to launch Superstudio

by Mokshith Voodarla • 3 min read

We discuss Kaiber's launch of Superstudio and how they use Sieve's infrastructure to power their AI video workloads.

October 15, 2024

Eye Contact 1.0: eye gaze redirection for developers

by Gaurang Bharti • 2 min read

We discuss a new gaze redirection pipeline designed to make the eyes in talking head videos look directly at the camera.

September 19, 2024

Kapwing partners with Sieve to launch AI Personas

by Mokshith Voodarla • 3 min read

How Sieve powers Kapwing's new AI avatar tool, enabling creators to generate automatic talking head videos in a few clicks.

September 17, 2024

SieveSync: realistic, zero-shot lipsync for developers

by Ahmed Hanzala • 5 min read

We discuss a new zero-shot lipsync pipeline built with MuseTalk, LivePortrait, and CodeFormer designed to preserve more realism than existing solutions.

September 10, 2024

How Scenery approaches human-centric AI video understanding with Sieve

by Mokshith Voodarla • 2 min read

We discuss how Scenery uses Sieve to run human-centric video understanding workloads that power features like AI Shorts.

September 9, 2024

VEED partners with Sieve to launch VEED Clips

by Mokshith Voodarla • 3 min read

We discuss a partnership between VEED and Sieve to launch VEED Clips, a new AI-powered video clipping tool.

September 2, 2024

Exploring ways to text prompt SAM 2

by Lachlan Gray • 6 min read

SAM 2 can't natively take in text prompts. We discuss various ways to build pipelines around SAM 2 to accomplish text-prompted segmentation.

August 27, 2024

The fastest way to run Meta's SAM 2 (Segment Anything Model 2)

by Jacob Marshall • 6 min read

Learn about Meta's SAM 2 (Segment Anything Model 2) and how Sieve's optimized implementation runs 2x faster. Explore use cases, benchmarks, and how to use SAM 2.

August 20, 2024

MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting

by Gaurang Bharti • 4 min read

In this blog, we dive into MuseTalk, a state-of-the-art zero-shot lipsyncing model. We cover how it works, its pros and cons, and how to run it on Sieve.

July 30, 2024

Dubbing an entire Khan Academy course in 10 minutes

by Mokshith Voodarla • 5 min read

We walk through using the Sieve API to download and dub an entire Khan Academy course in under 10 minutes.

June 20, 2024

Introducing Sieve Dubbing 1.0: AI Dubbing for Developers

by Mokshith Voodarla • 4 min read

We discuss the launch of Sieve’s Dubbing API, the first AI dubbing solution purpose-built for developers.

May 3, 2024

Introducing Autocrop 1.0: Format videos into different aspect ratios with AI editing

by Mokshith Voodarla • 3 min read

We discuss the launch of Autocrop 1.0, a new API that allows you to format videos into different aspect ratios with AI editing.

April 16, 2024

Zight and Sieve: Using AI to build better video communication

by Mokshith Voodarla • 4 min read

We discuss the importance of AI in video communication and why Zight chose Sieve to power their new AI features.

April 3, 2024

Finding highlights in long-form video content automatically

by Gaurang Bharti • 4 min read

We do a deep dive into building an intricate algorithm on top of LLMs to accurately identify and extract highlights from long-form video content.

March 26, 2024

How developers are changing video creation once again with AI

by Mokshith Voodarla • 5 min read

We discuss the first time computers drastically changed video creation and how it’s changing once again because of new AI models.

March 15, 2024

Introducing Describe: Incredibly descriptive audiovisual summaries for videos

by Gaurang Bharti • 5 min read

We discuss the launch of Describe along with the challenges and approaches to generating audiovisual descriptions of videos.

March 13, 2024

Adding Sound Effects to Stock Videos with AI

by Mokshith Voodarla • 2 min read

In this post, we build an app that adds sound effects to stock videos using vision language models and audio generation models.

March 6, 2024

Introducing GPU sharing on Sieve

by Gaurav Rao • 4 min read

In this post, we discuss support for GPU sharing on Sieve and how it enables faster, more cost-effective AI models.

February 28, 2024

Fast, efficient active speaker detection on videos

by Mokshith Voodarla • 5 min read

In this post, we discuss active speaker detection as a deep learning task and how we built a solution that performs ~90% faster than other solutions.

December 11, 2023

Announcing the most cost-effective audio transcription API

by Mokshith Voodarla • 5 min read

In this post, we discuss the commoditization of audio transcription and a new Sieve offering around it that is 5x cheaper than other providers while still maintaining speed and accuracy.

November 22, 2023

Improving on open-source for fast, high-quality AI lipsyncing

by Abhinav Ayalur • 3 min read

We discuss current lipsyncing solutions, such as OpenRetalker’s Video Retalking and SieveSync, and how to get a performant, production-ready lipsyncing solution.

October 19, 2023

State of the art audio enhancement in 5 minutes

by Abhi Upadhyay • 4 min read

Learn how to use an open-source AI audio enhancement app for your vlogs and other media, rivaling the best APIs on the market. Try it for yourself!

March 7, 2023

Automatically generating video chapter titles with AI

by Abhi Upadhyay • 4 min read

In this blog post, we go through the process of generating video chapter titles with OpenAI's Whisper + GPT-3 models and an open-source text segmentation technique!

February 28, 2023

Building realistic video AI avatars in an hour from scratch

by Akshara Soman • 4 min read

Learn about the specialized pipelines in the Sieve toolkit for creating realistic AI avatars, including Portrait Avatar, LivePortrait, and Lipsync. This blog provides a detailed discussion of strengths, limitations, and use cases.

November 14, 2022

Sieve's Video AI API Beta and ~$4M Raise

by Mokshith Voodarla • 2 min read

The explosion of rich data, the Sieve public beta, our ~$4M seed round, and how we enable developers to build amazing experiences with video + AI.