
How to detect duplicate videos using an API

April 8, 2026 · Tutorials · 5 min read

Duplicate-video detection isn't a hash lookup — re-encoding, cropping, and partial reuse defeat exact-match approaches. This walkthrough shows how to call POST /video/match with two URLs and turn the response into an actionable match decision.

Why exact-match hashing fails on video

If you're comparing two video files byte-for-byte (or with an MD5 / SHA hash), any re-encode breaks the match. Drop the bitrate, change the container, transcode to a different codec — the hash flips, even though the human-perceptible content is unchanged.
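To make the failure concrete, here is a minimal sketch using Python's standard hashlib. The byte strings are stand-ins for two encodes of the same clip, not real video data, and the change is a single flipped bit, which is far smaller than anything a real re-encode produces.

```python
import hashlib

# Stand-in bytes for an "original" file (not real video data).
original = bytes(range(256)) * 4

# Simulate the smallest possible difference: flip one bit in one byte.
reencoded = bytearray(original)
reencoded[100] ^= 0x01

digest_a = hashlib.sha256(original).hexdigest()
digest_b = hashlib.sha256(bytes(reencoded)).hexdigest()

# The two digests share nothing useful: equality is the only signal a
# cryptographic hash offers, and it is already lost.
print(digest_a == digest_b)  # False
```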

Real-world duplicate detection has to handle re-encoded copies (e.g., H.264 → H.265 or a different CRF), aspect-ratio changes (16:9 → 9:16 with letterboxing), trimmed clips (a 15-second segment of a 60-second source), added or removed watermarks, and audio swaps. An exact-match hash catches none of these.

Perceptual matching instead compares content-derived fingerprints: frame-level features that stay stable across encoding changes, combined with temporal alignment so that partial overlap is detectable.
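As a rough illustration of what "content-derived" means, the sketch below uses an average hash on a tiny grayscale frame. This is a well-known toy technique chosen for brevity; it is an assumption for illustration only and not a description of how MediaLayer computes its fingerprints. The frame values and the simulated re-encode noise are made up.

```python
def average_hash(frame):
    """One bit per pixel: 1 if the pixel is brighter than the frame mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def hamming(a, b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# A 4x4 grayscale "frame": dark on the left, bright on the right.
frame = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [11, 21, 201, 211],
]

# Simulated re-encode: a brightness shift plus small per-pixel noise.
reencoded = [
    [p + 7 + ((i + j) % 3) for j, p in enumerate(row)]
    for i, row in enumerate(frame)
]

# Genuinely different content: the same frame mirrored left to right.
mirrored = [row[::-1] for row in frame]

same = hamming(average_hash(frame), average_hash(reencoded))
different = hamming(average_hash(frame), average_hash(mirrored))
print(same, different)  # 0 16
```

The fingerprint survives the simulated re-encode unchanged, while different content lands far away, so a distance threshold can separate the two cases.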

What POST /video/match returns

The MediaLayer video endpoint takes two URLs, fetches both server-side, and returns a single JSON envelope. The shape is the same one used by /image/match and /audio/match, so call sites stay uniform across media types.

match — Boolean decision based on a sensible default threshold for video.
similarity_score — Float in [0, 1]. Use this directly when you need a custom threshold (e.g., a stricter cutoff for auto-block lanes).
matched_segments — Array of aligned segments. Each entry gives start/end timestamps in seconds for the overlapping content in both source and target, plus a per-segment score. Empty for image, populated for video and audio.

RESPONSE · /VIDEO/MATCH

{
  "match": true,
  "confidence": "high",
  "similarity_score": 0.91,
  "processing_time_ms": 1840,
  "media_type": "video",
  "matched_segments": [
    { "source_start": 0.0, "source_end": 14.8, "target_start": 2.3, "target_end": 17.1, "score": 0.93 },
    { "source_start": 30.5, "source_end": 38.2, "target_start": 60.0, "target_end": 67.8, "score": 0.87 }
  ]
}
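One way to keep call sites tidy is to parse the envelope into typed objects once and work with those. The sketch below assumes the field names and types are exactly those in the sample response above; the class names and the passes_threshold helper are hypothetical, and fields the API may mark as optional are not handled.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    source_start: float
    source_end: float
    target_start: float
    target_end: float
    score: float


@dataclass(frozen=True)
class MatchResult:
    match: bool
    confidence: str
    similarity_score: float
    processing_time_ms: int
    media_type: str
    matched_segments: list


def parse_match_response(payload: dict) -> MatchResult:
    """Build a MatchResult from the decoded JSON envelope."""
    segments = [
        Segment(
            source_start=s["source_start"],
            source_end=s["source_end"],
            target_start=s["target_start"],
            target_end=s["target_end"],
            score=s["score"],
        )
        for s in payload.get("matched_segments", [])
    ]
    return MatchResult(
        match=payload["match"],
        confidence=payload["confidence"],
        similarity_score=payload["similarity_score"],
        processing_time_ms=payload["processing_time_ms"],
        media_type=payload["media_type"],
        matched_segments=segments,
    )


def passes_threshold(result: MatchResult, threshold: float) -> bool:
    """Apply a caller-chosen cutoff instead of the default match flag."""
    return result.similarity_score >= threshold


# The sample response from this article.
SAMPLE = """{
  "match": true, "confidence": "high", "similarity_score": 0.91,
  "processing_time_ms": 1840, "media_type": "video",
  "matched_segments": [
    {"source_start": 0.0, "source_end": 14.8, "target_start": 2.3, "target_end": 17.1, "score": 0.93},
    {"source_start": 30.5, "source_end": 38.2, "target_start": 60.0, "target_end": 67.8, "score": 0.87}
  ]
}"""

result = parse_match_response(json.loads(SAMPLE))
print(result.match, len(result.matched_segments))  # True 2
print(passes_threshold(result, 0.95))  # False: 0.91 is below a stricter cutoff
```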

Calling the endpoint

Authentication is via RapidAPI: subscribe to the MediaLayer listing, copy your x-rapidapi-key, and pass three headers on every call.

CURL

curl -X POST https://medialayer-image-audio-video-matching-api.p.rapidapi.com/video/match \
  -H "x-rapidapi-key: YOUR_RAPIDAPI_KEY" \
  -H "x-rapidapi-host: medialayer-image-audio-video-matching-api.p.rapidapi.com" \
  -H "Content-Type: application/json" \
  -d '{
    "source_url": "https://example.com/source.mp4",
    "target_url": "https://example.com/target.mp4"
  }'

Python — sync with requests

Drop-in pattern for a backend that processes uploads one at a time. For a high-throughput pipeline, use httpx with an async client and bound concurrency to your RapidAPI plan's per-second limit.

PYTHON · REQUESTS

import requests

API_HOST = "medialayer-image-audio-video-matching-api.p.rapidapi.com"

def is_duplicate_video(source_url: str, target_url: str, threshold: float = 0.85) -> bool:
    """Return True when similarity_score meets the caller's threshold."""
    headers = {
        "x-rapidapi-key": "YOUR_RAPIDAPI_KEY",  # load from server-side config in production
        "x-rapidapi-host": API_HOST,
        "Content-Type": "application/json",
    }
    payload = {"source_url": source_url, "target_url": target_url}

    # Both URLs are fetched server-side, so allow a generous timeout.
    r = requests.post(
        f"https://{API_HOST}/video/match",
        json=payload,
        headers=headers,
        timeout=60,
    )
    r.raise_for_status()  # surface 4xx/5xx instead of parsing an error body
    data = r.json()
    # Apply our own cutoff rather than the API's default "match" flag.
    return data["similarity_score"] >= threshold
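For the high-throughput case, a bounded-concurrency version might look like the sketch below. It assumes `client` is an `httpx.AsyncClient` created by the caller; the function names and the default of 4 concurrent requests are hypothetical. Note that a semaphore caps requests in flight, not requests per second, so a strict per-second plan limit would still need a separate rate limiter.

```python
import asyncio

API_HOST = "medialayer-image-audio-video-matching-api.p.rapidapi.com"
API_URL = f"https://{API_HOST}/video/match"


async def match_pair(client, api_key, source_url, target_url):
    """POST one source/target pair and return the decoded JSON envelope.

    `client` is expected to behave like an httpx.AsyncClient.
    """
    response = await client.post(
        API_URL,
        json={"source_url": source_url, "target_url": target_url},
        headers={
            "x-rapidapi-key": api_key,
            "x-rapidapi-host": API_HOST,
            "Content-Type": "application/json",
        },
    )
    response.raise_for_status()
    return response.json()


async def match_many(client, api_key, pairs, max_concurrency=4):
    """Match every (source_url, target_url) pair, at most N at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(source_url, target_url):
        async with semaphore:
            return await match_pair(client, api_key, source_url, target_url)

    # gather() returns results in the same order as `pairs`.
    return await asyncio.gather(*(guarded(s, t) for s, t in pairs))


# Example wiring (requires `pip install httpx`; the key variable is yours to supply):
#
#   import httpx
#
#   async def main(api_key, pairs):
#       async with httpx.AsyncClient(timeout=60) as client:
#           return await match_many(client, api_key, pairs)
#
#   results = asyncio.run(main(api_key, pairs))
```

Taking the client as a parameter keeps one connection pool for the whole batch and makes the helper easy to exercise with a stub in tests.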

Acting on matched_segments

The boolean match is fine for binary block / pass decisions, but real workflows usually want to act on overlap duration. Sum the duration of matched_segments to get total overlapping seconds, and compare that against your policy threshold.

For ownership and monetization workflows, the same calculation drives the share-revenue / hold / takedown lane. For trust-and-safety workflows, it's the difference between actioning a 1-second incidental match and a 45-second full-clip lift.

TYPESCRIPT

function totalOverlapSeconds(
  segments: { source_start: number; source_end: number }[],
): number {
  return segments.reduce(
    (acc, s) => acc + (s.source_end - s.source_start),
    0,
  );
}

// Use overlap duration, not just the boolean match flag.
const overlap = totalOverlapSeconds(response.matched_segments);
if (overlap >= 30) routeTo("review");
else if (overlap >= 5) routeTo("monitor");
else routeTo("pass");
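A plain sum counts a stretch of source video twice if two segments cover it. The article does not say whether the API can return segments that overlap on the source timeline, so treat that as an open assumption; the Python sketch below merges overlapping intervals before summing, which is harmless when they never overlap. The function names are hypothetical, and the lane thresholds mirror the 30-second and 5-second cutoffs used above.

```python
def merged_overlap_seconds(segments):
    """Total source seconds covered by at least one matched segment."""
    intervals = sorted((s["source_start"], s["source_end"]) for s in segments)
    total = 0.0
    current_start = current_end = None
    for start, end in intervals:
        if current_end is None or start > current_end:
            # Gap before this interval: close out the previous run.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlaps or touches the current run: extend it.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def lane_for_overlap(seconds):
    """Map overlap duration to the review / monitor / pass lanes."""
    if seconds >= 30:
        return "review"
    if seconds >= 5:
        return "monitor"
    return "pass"


# Segments from the sample response: 14.8 s + 7.7 s of source content.
sample_segments = [
    {"source_start": 0.0, "source_end": 14.8},
    {"source_start": 30.5, "source_end": 38.2},
]
sample_overlap = merged_overlap_seconds(sample_segments)
print(round(sample_overlap, 1), lane_for_overlap(sample_overlap))  # 22.5 monitor

# Two segments covering 0-10 s and 5-12 s span 12 s of source, not 17 s.
overlapping = [
    {"source_start": 0.0, "source_end": 10.0},
    {"source_start": 5.0, "source_end": 12.0},
]
print(merged_overlap_seconds(overlapping))  # 12.0
```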

Production checklist

Keep keys server-side — Never embed x-rapidapi-key in browser or mobile clients. Proxy from your backend and inject the header there.
Bound timeouts — Video matching can take a few seconds for long clips. Pick a per-call timeout that matches your queue's SLA (60s is a sane upper bound).
Use public URLs — URL validation rejects private, loopback, and cloud-metadata addresses. Make sure source_url and target_url are publicly reachable.
Move to one-to-many — Checking each new upload against a growing reference catalog of N items takes N two-URL calls per upload, so cost climbs with the catalog. For that workload, switch to Enterprise media search.
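For the "Use public URLs" item, a cheap local pre-check can reject obviously unreachable URLs before spending an API call. This is a sketch using only the Python standard library; the function name is hypothetical, it does not resolve DNS names, and it does not reproduce the API's own validation rules, which this article does not spell out.

```python
import ipaddress
from urllib.parse import urlparse


def looks_publicly_reachable(url: str) -> bool:
    """Best-effort local check; the API's own validation remains authoritative."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False

    host = parsed.hostname
    if host == "localhost" or host.endswith(".localhost"):
        return False

    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so it is a DNS name. We do not resolve it here,
        # which means a name pointing at a private address still passes.
        return True

    # is_global is False for private, loopback, and link-local ranges,
    # including the 169.254.169.254 cloud-metadata address.
    return address.is_global


print(looks_publicly_reachable("https://example.com/source.mp4"))  # True
print(looks_publicly_reachable("http://127.0.0.1/source.mp4"))  # False
print(looks_publicly_reachable("http://169.254.169.254/latest/meta-data"))  # False
```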

Endpoints used in this article

POST /video/match

Compare two videos and return aligned matched segments with similarity scores. Built to survive re-encoding, cropping, and partial reuse.


POST /image/match

If you only need to match a single representative frame, the image endpoint is faster and uses the same envelope.
