Introducing Vanish: the best way to remove backgrounds from video
We discuss a new pipeline for removing backgrounds from video that offers high-quality outputs on complex scenes as well as a fast option for simpler videos.
by Jacob Marshall

Today we’re excited to introduce Vanish (and its little sibling, Parallax), the best way for developers to remove backgrounds from video. Background removal is a common task in media processing, but there are no high-quality, developer-friendly solutions available for video. sieve/background-removal is a pipeline that combines SAM 2 with proprietary models to give developers both a fast option and a high-quality option for background removal.

You can try the pipeline and integrate it via API here.

Evaluating existing solutions like BiRefNet-Lite

Most existing solutions try to apply image models to video, which causes problems with consistency and quality, especially in high-movement scenes. Below is an example of running BiRefNet-Lite via rembg.

Vanish: High-quality video background removal powered by SAM 2

Vanish is built on top of SAM 2 and a guiding foreground model, which together result in extremely high-quality masks.

Segment Anything 2 (SAM 2) from Meta is an open-source model for object tracking and segmentation. We adapted it into an automatic background removal tool by tackling key challenges, primarily SAM 2’s need for user prompts. Since background removal should be entirely automatic, we devised an approach to auto-select the relevant objects in each frame, aiming to outperform existing image-to-mask solutions.

Our solution combines SAM 2 with a traditional foreground model (BiRefNet-Lite) in a hybrid pipeline. Starting from a foreground mask as the prompt, SAM 2 is iteratively guided to track specific objects across frames. To account for objects that appear in later frames, we compare the tracked objects against the foreground mask; significant differences prompt SAM 2 to track new objects. This method, refined by filtering out noise from the foreground model, isolates the main subjects, resulting in highly effective background removal across video frames. We plan to go into more detail about this approach in a follow-up technical blog.
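
To make the comparison step a little more concrete, here is a minimal sketch of how a "significant difference" check between the tracked objects and the foreground mask could look. It is an illustration under stated assumptions, not the production implementation: the function names, the list-of-rows mask representation, and the 0.2 threshold are all hypothetical, and the noise filtering mentioned above is left out.

```python
from typing import List

# Hypothetical mask format: rows of 0/1 pixels, where 1 means "foreground".
Mask = List[List[int]]


def residual_mask(foreground: Mask, tracked: Mask) -> Mask:
    """Pixels the foreground model marks that no tracked object covers."""
    return [
        [1 if fg and not tr else 0 for fg, tr in zip(fg_row, tr_row)]
        for fg_row, tr_row in zip(foreground, tracked)
    ]


def uncovered_fraction(foreground: Mask, tracked: Mask) -> float:
    """Share of foreground pixels that the tracked objects miss."""
    fg_pixels = sum(sum(row) for row in foreground)
    if fg_pixels == 0:
        return 0.0  # nothing in the foreground, so nothing can be missed
    missed = sum(sum(row) for row in residual_mask(foreground, tracked))
    return missed / fg_pixels


def needs_new_prompt(foreground: Mask, tracked: Mask, threshold: float = 0.2) -> bool:
    """True when enough foreground is untracked to justify a new prompt.

    The threshold is an assumed value chosen for this example only.
    """
    return uncovered_fraction(foreground, tracked) > threshold


# A second object enters on the right-hand side of the frame.
foreground = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 1],
]
tracked = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]

if needs_new_prompt(foreground, tracked):
    # In a real pipeline this residual would seed a new tracking prompt.
    new_prompt = residual_mask(foreground, tracked)
```

In this toy frame, two of the six foreground pixels are untracked, so the check fires and the residual mask marks only the newly arrived object.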

Parallax: Fast, consistent background removal

While some use cases deal with highly complex scenes, we also heard the need for speed from developers working with simpler videos like talking heads. To this end, we’re introducing Parallax, a backend option that runs at 30 FPS while producing high-quality outputs on simpler videos. You can try it by setting backend to parallax in the sieve/background-removal pipeline.
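
As a rough sketch of what selecting the backend might look like in code, the snippet below assembles the inputs for a call to the pipeline. Only the `backend` option and the `parallax` value come from this post; the `input_file` key, the `vanish` value, and the `build_inputs` helper are assumptions made for illustration, so check the pipeline's API page for the actual parameter names.

```python
from typing import Any, Dict

# "parallax" is named in this post; "vanish" is an assumed name for the
# high-quality backend and may differ in the real pipeline.
KNOWN_BACKENDS = ("vanish", "parallax")


def build_inputs(video_url: str, backend: str = "parallax") -> Dict[str, Any]:
    """Assemble hypothetical inputs for sieve/background-removal."""
    if backend not in KNOWN_BACKENDS:
        raise ValueError(f"unknown backend {backend!r}, expected one of {KNOWN_BACKENDS}")
    return {
        "input_file": {"url": video_url},  # assumed key name for the video
        "backend": backend,                # option described in this post
    }


inputs = build_inputs("https://example.com/talking-head.mp4")
```

With Sieve's Python client, inputs like these would presumably be passed to the function returned by `sieve.function.get("sieve/background-removal")`; treat that call shape as an assumption too and confirm it against the client documentation.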

Conclusion

Background removal is becoming a common part of creative tools, data annotation pipelines, and many other media processing applications. To get started with Sieve’s pipeline, sign up for an account here and get $20 in free credit. We’re excited to see what you build.