
Depth Anything v2

This is a deployment of the depth estimation model Depth Anything v2, which estimates the relative depth of each pixel in an input frame. It supports both image and video inputs.

Features

  • High-Quality Depth Estimation: Produces accurate, high-resolution depth maps
  • Output Formats: Produce a colored visualization, a normalized depth map, or a depth map with the raw values from the model
  • Video Support: Compute depth maps for an entire video!

Parameters

  • input: The input image or video used to compute depth maps.
  • model_type: Which Depth Anything v2 model to use. Choose small, base, or large.
    • small: Fastest, lowest quality
    • base: Middle ground between small and large
    • large: Slowest, highest quality; best temporal consistency for videos
  • return_type: Depth map output format. Choose colormap, normalized, or raw.
    • colormap: Returns a colored depth map as a PNG image.
      • the color scale runs from red (closer to the camera) to blue (further away)
    • normalized: Returns a normalized depth map as a greyscale PNG image (pixel values from 0 to 255).
      • higher values are closer to the camera
    • raw: Returns a relative depth map with the raw values from the model as a numpy (.npy) file.
      • higher values are closer to the camera
  • video_output_format: (video only) If zip, returns a zip file containing the output frames. If mp4, returns a video file.
    • output frames in a zip file will be numbered starting from 00000000.
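
The normalized format appears to be a min-max scaling of the raw depths into the 0-255 range. If you are working from the raw .npy output, a sketch like the following reproduces a greyscale map yourself (normalize_depth is a helper written here, not part of the Sieve SDK, and min-max scaling is an assumption about how the normalized output is produced):

```python
import numpy as np

def normalize_depth(depth):
    """Min-max scale a raw depth array to the 0-255 range (uint8).

    depth: array loaded from the model's raw .npy output, where
    higher values mean closer to the camera.
    """
    lo, hi = depth.min(), depth.max()
    scaled = (depth - lo) / (hi - lo)   # map to [0, 1]
    return (scaled * 255).astype(np.uint8)

# usage: depths = np.load(output_raw_file); img = normalize_depth(depths)
```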

Examples

[Example outputs: Source | Normalized | Colored, shown for a GIF and a still image.]

Pricing

For this model, we charge according to our compute-based pricing. See https://www.sievedata.com/pricing for more details.

This model is deployed on an L4 GPU instance, and is charged at a usage-based rate of $1.25 per hour.
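
Since billing is per hour of compute, a quick back-of-the-envelope helper shows how the $1.25/hour rate translates to a per-job cost (estimated_cost is a hypothetical helper for illustration, not part of the Sieve SDK):

```python
RATE_PER_HOUR = 1.25  # L4 GPU usage rate quoted above

def estimated_cost(compute_seconds):
    # Convert seconds of compute to hours, then apply the hourly rate.
    return compute_seconds / 3600 * RATE_PER_HOUR

print(f"${estimated_cost(120):.4f}")  # a job using 2 minutes of compute
```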

Using Depth Anything v2 via the Sieve SDK

import os
os.environ["SIEVE_API_KEY"] = "YOUR_SIEVE_API_KEY"  # replace with your own key; never hard-code a real key

import sieve

depth_anything_v2 = sieve.function.get("sieve-internal/depth-anything-v2")

# define the input as a sieve.File object
input_image = sieve.File(path="/home/azureuser/sample_inputs/karp.png")

# get a normalized depth map
out = depth_anything_v2.run(input_image, model_type="small", return_type="normalized")
output_normalized_file = out.path

# get a colored depth map
out = depth_anything_v2.run(input_image, model_type="small", return_type="colormap")
output_colormap_file = out.path

# get raw depths as a numpy .npy file
out = depth_anything_v2.run(input_image, model_type="small", return_type="raw")
output_raw_file = out.path
# load the raw depths as a numpy array
import numpy as np
depths = np.load(output_raw_file)

# also works on videos!
input_video = sieve.File(path="/home/azureuser/sample_inputs/mok_short.mp4")
out = depth_anything_v2.run(input_video, model_type="small", return_type="normalized", video_output_format="mp4")
output_normalized_video_file = out.path

# get a zip file of the output frames
out = depth_anything_v2.run(input_video, model_type="small", return_type="normalized", video_output_format="zip")
output_normalized_zip_file = out.path
import zipfile
with zipfile.ZipFile(output_normalized_zip_file, "r") as zip_ref:
    zip_ref.extractall("frame_depths")
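
Because the extracted frames are zero-padded starting from 00000000, a plain lexicographic sort recovers their chronological order. A small helper along these lines works (ordered_frame_paths is my name, not part of the SDK, and the .png extension is an assumption based on the PNG output formats above):

```python
import os

def ordered_frame_paths(frames_dir, ext=".png"):
    # Zero-padded names (00000000.png, 00000001.png, ...) sort
    # correctly as plain strings, so sorted() gives frame order.
    return [os.path.join(frames_dir, name)
            for name in sorted(os.listdir(frames_dir))
            if name.endswith(ext)]

# usage: for path in ordered_frame_paths("frame_depths"): ...
```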

License

The small version of the model is licensed by its creators under the Apache 2.0 license.

All other versions are licensed under the CC-BY-NC-4.0 license.