Tutorial
How to detect duplicate videos using an API
Duplicate-video detection isn't a hash lookup — re-encoding, cropping, and partial reuse defeat exact-match approaches. This walkthrough shows how to call POST /video/match with two URLs and turn the response into an actionable match decision.
Why exact-match hashing fails on video
If you're comparing two video files byte-for-byte (or with an MD5 / SHA hash), any re-encode breaks the match. Drop the bitrate, change the container, transcode to a different codec — the hash flips, even though the human-perceptible content is unchanged.
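To see the failure concretely, here is a minimal stdlib sketch. The byte strings are stand-ins for a clip before and after re-encoding (illustrative data, not real MP4 files):

```python
import hashlib

# Stand-ins for the same clip before and after a re-encode:
# the perceived content is "the same", but the bytes are not.
original = b"ftypmp42" + b"frame-data" * 100
reencoded = b"ftypmp42" + b"frame-data-recompressed" * 80

# Any byte-level difference flips the digest entirely.
print(hashlib.md5(original).hexdigest() == hashlib.md5(reencoded).hexdigest())  # → False
```

The digests share nothing, even though both blobs notionally carry the same content; that avalanche behavior is exactly what makes cryptographic hashes useless for near-duplicate detection.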
Real-world duplicate detection has to handle:
- re-encoded copies (e.g., H.264 → H.265 or a different CRF)
- aspect-ratio changes (16:9 → 9:16 with letterboxing)
- trimmed clips (a 15-second segment of a 60-second source)
- watermark adds and removes
- audio swaps
Hash-based approaches can't represent any of those.
Perceptual matching uses content-derived fingerprints — frame-level features for video, robust to encoding, plus alignment so partial overlap is detectable.
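As a toy illustration of the idea (not MediaLayer's actual fingerprint), reduce each frame to a coarse brightness bitmap and compare bitmaps by Hamming distance: light encoding noise flips few or no bits, while different content flips many.

```python
def frame_fingerprint(pixels: list[list[int]]) -> list[int]:
    """Threshold each pixel against the frame's mean brightness."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p >= mean else 0 for p in flat]

def hamming(a: list[int], b: list[int]) -> int:
    """Count differing bits between two fingerprints."""
    return sum(x != y for x, y in zip(a, b))

frame = [[10, 200], [30, 220]]
noisy = [[12, 198], [28, 223]]   # same frame after light re-encoding noise
other = [[200, 10], [220, 30]]   # different content

print(hamming(frame_fingerprint(frame), frame_fingerprint(noisy)))  # → 0
print(hamming(frame_fingerprint(frame), frame_fingerprint(other)))  # → 4
```

Production systems use far richer features and temporal alignment, but the robustness property is the same: the fingerprint is derived from content, so encoding-level changes barely move it.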
What POST /video/match returns
The MediaLayer video endpoint takes two URLs, fetches both server-side, and returns a single JSON envelope. The shape is the same one used by /image/match and /audio/match, so call sites stay uniform across media types.
- match — Boolean decision based on a sensible default threshold for video.
- similarity_score — Float in [0, 1]. Use this directly when you need a custom threshold (e.g., a stricter cutoff for auto-block lanes).
- matched_segments — Array of aligned start/end timestamps in seconds — the source-to-target alignment of overlapping content. Empty for image, populated for video and audio.
{
"match": true,
"confidence": "high",
"similarity_score": 0.91,
"processing_time_ms": 1840,
"media_type": "video",
"matched_segments": [
{ "source_start": 0.0, "source_end": 14.8, "target_start": 2.3, "target_end": 17.1, "score": 0.93 },
{ "source_start": 30.5, "source_end": 38.2, "target_start": 60.0, "target_end": 67.8, "score": 0.87 }
]
}

Calling the endpoint
Authentication is via RapidAPI: subscribe to the MediaLayer listing, copy your x-rapidapi-key, and pass three headers on every call.
curl -X POST https://medialayer-image-audio-video-matching-api.p.rapidapi.com/video/match \
-H "x-rapidapi-key: YOUR_RAPIDAPI_KEY" \
-H "x-rapidapi-host: medialayer-image-audio-video-matching-api.p.rapidapi.com" \
-H "Content-Type: application/json" \
-d '{
"source_url": "https://example.com/source.mp4",
"target_url": "https://example.com/target.mp4"
}'

Python — sync with requests
Drop-in pattern for a backend that processes uploads one at a time. For a high-throughput pipeline, use httpx with an async client and bound concurrency to your RapidAPI plan's per-second limit.
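The bounded-concurrency shape that async variant takes can be sketched like this. Note the assumptions: `match_pairs` and the injected `post` coroutine are illustrative names, not part of the API; in a real pipeline `post` would wrap `httpx.AsyncClient.post(...)` with the same URL, headers, and JSON payload as the synchronous call below.

```python
import asyncio
from typing import Awaitable, Callable

async def match_pairs(
    pairs: list[tuple[str, str]],
    post: Callable[[str, str], Awaitable[dict]],
    max_concurrency: int = 5,
) -> list[dict]:
    """Run /video/match calls for many URL pairs with bounded concurrency."""
    sem = asyncio.Semaphore(max_concurrency)

    async def one(source_url: str, target_url: str) -> dict:
        async with sem:  # cap in-flight requests at max_concurrency
            return await post(source_url, target_url)

    return await asyncio.gather(*(one(s, t) for s, t in pairs))
```

Injecting `post` keeps the concurrency logic testable without a network; the actual HTTP call lives at the edge, and `max_concurrency` maps directly onto your plan's rate limit.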
import requests

def is_duplicate_video(source_url: str, target_url: str, threshold: float = 0.85) -> bool:
    headers = {
        "x-rapidapi-key": "YOUR_RAPIDAPI_KEY",
        "x-rapidapi-host": "medialayer-image-audio-video-matching-api.p.rapidapi.com",
        "Content-Type": "application/json",
    }
    payload = {"source_url": source_url, "target_url": target_url}
    r = requests.post(
        "https://medialayer-image-audio-video-matching-api.p.rapidapi.com/video/match",
        json=payload,
        headers=headers,
        timeout=60,
    )
    r.raise_for_status()
    data = r.json()
    return data["similarity_score"] >= threshold

Acting on matched_segments
The boolean match is fine for binary block / pass decisions, but real workflows usually want to act on overlap duration. Sum the duration of matched_segments to get total overlapping seconds, and compare that against your policy threshold.
For ownership and monetization workflows, the same calculation drives the share-revenue / hold / takedown lane. For trust-and-safety workflows, it's the difference between actioning a 1-second incidental match and a 45-second full-clip lift.
function totalOverlapSeconds(
  segments: { source_start: number; source_end: number }[],
): number {
  return segments.reduce(
    (acc, s) => acc + (s.source_end - s.source_start),
    0,
  );
}

// Use overlap duration, not just the boolean match flag.
const overlap = totalOverlapSeconds(response.matched_segments);
if (overlap >= 30) routeTo("review");
else if (overlap >= 5) routeTo("monitor");
else routeTo("pass");

Production checklist
- Keep keys server-side — Never embed x-rapidapi-key in browser or mobile clients. Proxy from your backend and inject the header there.
- Bound timeouts — Video matching can take a few seconds for long clips. Pick a per-call timeout that matches your queue's SLA (60s is a sane upper bound).
- Use public URLs — URL validation rejects private, loopback, and cloud-metadata addresses. Make sure source_url and target_url are publicly reachable.
- Move to one-to-many — If you're doing N-vs-N comparisons against a growing reference catalog, the public two-URL API will get expensive fast. Switch to Enterprise media search.
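The URL-validation point can also be mirrored client-side to fail fast before spending a call. A rough preflight sketch follows; `is_publicly_reachable` is a hypothetical helper, and the API still enforces its own checks server-side:

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_publicly_reachable(url: str) -> bool:
    """Cheap client-side preflight: reject URLs the API will refuse anyway."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False
    for info in infos:
        addr = ipaddress.ip_address(info[4][0])
        # Reject private, loopback, and link-local (cloud-metadata) ranges.
        if addr.is_private or addr.is_loopback or addr.is_link_local:
            return False
    return True
```

Running this before submission turns a guaranteed 4xx from the API into a local skip, which matters when each call counts against a metered plan.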
Endpoints used in this article
POST /video/match
Compare two videos and return aligned matched segments with similarity scores. Built to survive re-encoding, cropping, and partial reuse.
See full reference →

POST /image/match
If you only need to match a single representative frame, the image endpoint is faster and uses the same envelope.
See full reference →

Related articles
Near-duplicate media detection: image, video, and audio in one API
Why near-duplicate detection is harder than exact-match hashing — and how the same envelope handles all three media types.
Read article →

Audio fingerprinting API explained
What fingerprinting actually does and why hashes don't work for transcoded audio.
Read article →

Keep exploring
Video playground
Try /video/match in your browser before wiring it into your pipeline.
Open →

Copyright / reuse detection
How aligned matched segments drive ownership and monetization workflows.
Open →

Enterprise media search
One-to-many matching against indexed catalogs — the right surface when pairwise comparisons stop scaling.
Open →

Ready to wire it in?
Subscribe on RapidAPI to call the public API on your own key, or talk to MediaLayer AI Labs about enterprise direct API access.