MediaLayer

Concept

Near-duplicate media detection: image, video, and audio in one API

Concepts · 7 min read

Exact-match hashing is the easy half — it's the near-duplicates that consume review queues. This article covers what 'near-duplicate' actually means across image, video, and audio, and how to handle all three with a single JSON envelope.

Exact-match vs. near-duplicate

An exact-match check is a hash lookup: same bytes, same hash, match. It's fast, deterministic, and useful for catching the laziest re-uploads.

A near-duplicate is content that any human would call the same — but the bytes don't match. Re-encoded video, watermarked image, transcoded audio, cropped frame, mirrored asset, recompressed JPEG. Hashes can't represent any of those.

The cost of missing near-duplicates is concrete: re-uploaded harmful media, recycled ad creatives that re-enter brand-safety queues, copied marketplace listings, and rights-violating clips that slip past dedupe.

What survives, by media type

  • Image: Re-encoding (JPEG ↔ PNG ↔ WebP), quality drops, resizing, cropping, watermarking, mirroring, light color edits.
  • Video: Codec changes, container swaps, aspect-ratio changes (with letterboxing), trimmed clips of a longer source, and audio swaps. matched_segments returns the aligned overlap.
  • Audio: Codec changes, bitrate drops, sample-rate conversion, light EQ, modest pitch / time-stretching, and partial reuse. matched_segments returns offset-aligned overlap.

One envelope, three endpoints

MediaLayer exposes the same request shape on /image/match, /video/match, and /audio/match: two URLs in, structured envelope out. The response shape is uniform; only matched_segments changes (empty for image, populated for video and audio).

REQUEST BODY
{
  "source_url": "https://example.com/source.{jpg|mp4|mp3}",
  "target_url": "https://example.com/target.{jpg|mp4|mp3}"
}

Reading the response

match is the convenience boolean — sensible defaults per medium. similarity_score is the source of truth; pick a threshold that matches your false-positive tolerance. matched_segments is where near-duplicate detection earns its keep: aligned start/end timestamps in seconds, so review tools can show reviewers exactly which seconds of the source map to which seconds of the target.

RESPONSE · VIDEO
{
  "match": true,
  "confidence": "high",
  "similarity_score": 0.91,
  "processing_time_ms": 1840,
  "media_type": "video",
  "matched_segments": [
    { "source_start": 0.0, "source_end": 14.8, "target_start": 2.3, "target_end": 17.1, "score": 0.93 }
  ]
}
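A sketch of turning that response into reviewer-facing text. The field names come from the response above; the formatting choices are ours, not anything the API prescribes:

```python
def summarize_segments(response: dict) -> list[str]:
    """Render matched_segments as human-readable lines for a review tool."""
    lines = []
    for seg in response.get("matched_segments", []):
        lines.append(
            f"source {seg['source_start']:.1f}-{seg['source_end']:.1f}s "
            f"maps to target {seg['target_start']:.1f}-{seg['target_end']:.1f}s "
            f"(score {seg['score']:.2f})"
        )
    return lines

response = {
    "match": True,
    "similarity_score": 0.91,
    "media_type": "video",
    "matched_segments": [
        {"source_start": 0.0, "source_end": 14.8,
         "target_start": 2.3, "target_end": 17.1, "score": 0.93}
    ],
}
# summarize_segments(response)
# → ["source 0.0-14.8s maps to target 2.3-17.1s (score 0.93)"]
```

For image responses matched_segments is empty, so the function simply returns an empty list.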

Same call, three media types

A typical pattern: a single dedupe service that dispatches to the right endpoint based on the media type extracted from the URL. The call site is uniform; only the path changes.

PYTHON · DISPATCHER
import requests
from urllib.parse import urlparse

API_HOST = "medialayer-image-audio-video-matching-api.p.rapidapi.com"
HEADERS = {
    "x-rapidapi-key": "YOUR_RAPIDAPI_KEY",
    "x-rapidapi-host": API_HOST,
    "Content-Type": "application/json",
}

EXT_TO_PATH = {
    ".jpg": "/image/match", ".jpeg": "/image/match", ".png": "/image/match",
    ".webp": "/image/match", ".gif": "/image/match",
    ".mp4": "/video/match", ".mov": "/video/match", ".webm": "/video/match",
    ".mp3": "/audio/match", ".wav": "/audio/match", ".aac": "/audio/match",
    ".m4a": "/audio/match", ".flac": "/audio/match", ".ogg": "/audio/match",
}

def match(source_url: str, target_url: str) -> dict:
    # Parse the URL path so query strings ("?v=2") don't break extension detection.
    ext = "." + urlparse(source_url).path.rsplit(".", 1)[-1].lower()
    path = EXT_TO_PATH.get(ext)
    if not path:
        raise ValueError(f"Unsupported extension: {ext}")
    r = requests.post(
        f"https://{API_HOST}{path}",
        json={"source_url": source_url, "target_url": target_url},
        headers=HEADERS,
        timeout=60,
    )
    r.raise_for_status()
    return r.json()

Choosing thresholds

There isn't one threshold that works across every domain. T&S and copyright workflows need a higher score (you're acting on someone's content). Marketplace dedupe can tolerate lower scores because near-duplicates still warrant review.

Two patterns work well in production. First: use similarity_score with a per-vertical threshold and route into review / monitor / pass lanes. Second: combine score with matched-segment duration so 'high score, 1-second overlap' doesn't auto-block a real-life incidental match.
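The second pattern can be sketched as a small routing function. The thresholds and the 1.0-second minimum overlap below are illustrative starting points, not API defaults; tune them per vertical:

```python
def total_overlap_seconds(segments: list[dict]) -> float:
    """Sum of matched source-side durations across segments."""
    return sum(s["source_end"] - s["source_start"] for s in segments)

def route(similarity_score: float, segments: list[dict],
          review_threshold: float = 0.85, monitor_threshold: float = 0.70,
          min_overlap_s: float = 1.0) -> str:
    """Route a match result into review / monitor / pass lanes.

    A high score backed by only a sliver of overlap is demoted to
    'monitor' so incidental one-second matches don't auto-escalate.
    Image results have no segments, so the overlap gate is skipped.
    """
    overlap_ok = (not segments) or total_overlap_seconds(segments) >= min_overlap_s
    if similarity_score >= review_threshold and overlap_ok:
        return "review"
    if similarity_score >= monitor_threshold:
        return "monitor"
    return "pass"
```

With the video response shown earlier, route(0.91, segments) lands in the review lane; the same score with only half a second of overlap drops to monitor.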

When to graduate to one-to-many

The two-URL endpoints are great for pairwise comparisons against a small reference catalog. Past a few thousand reference assets, pairwise calls become wasteful — you're doing N comparisons to find a top-K match.

That's where Enterprise media search comes in: ingest the catalog into a similarity index, run one-to-many lookups on every new upload, and get top-K matches with scores in a single call. Same matching primitives, different access pattern. Public users stay on RapidAPI; enterprise direct API access is available after onboarding.

Ready to wire it in?

Subscribe on RapidAPI to call the public API on your own key, or talk to MediaLayer AI Labs about enterprise direct API access.