AI Service Build · Computer Vision · Vector Search · ML Systems

Four capabilities, one platform, cost optimised: visual intelligence at broadcast scale.

A media library containing tens of thousands of video and image assets. A hard cost constraint. The result: a four-capability visual search platform that would have cost 10× more to run on standard tooling.

10× cheaper than standard vector DBs

$0 compute cost at idle

4 visual intelligence capabilities

Days→Hours processing time reduction

Engagement Snapshot

Sector	Ad Tech / Media Intelligence
Service	Platform Design & Engineering
Scale	Tens of thousands of video & image assets
Cost outcome	10× reduction
Processing time	Days to hours — distributed and scalable

Key Highlights

→ 10× cheaper than standard vector database approaches

→ Zero idle cost — distributed serverless compute scales to zero when not in use

→ Four distinct capabilities — frame search, semantic search, structural similarity, image equivalents

→ S3 vector buckets + optimised storage — cost-efficient retrieval without a dedicated vector DB

→ Scales to thousands of queries on demand — spiky media workloads handled without over-provisioning

Context & Challenge

The media library contained tens of thousands of video and image assets per client, and it was growing constantly. A single campaign generates multiple versions of the same creative: different durations, crops for different placements, regional re-edits, localised audio. Finding, grouping, and retrieving related assets was slow, manual, and increasingly unworkable at scale.

The product requirement was clear: build a visual intelligence layer that lets clients search and discover assets intelligently — not just by filename or metadata, but by what the content actually is and looks like. Four capabilities were needed: frame-level retrieval, semantic video search, structural similarity across versions, and the same suite for images.

The engineering constraint was equally clear: the system had to be cheap to run. Media query loads are spiky: bursts of activity around campaign launches, then silence. A standard managed vector database such as OpenSearch Serverless would have cost $350–$460/month in always-on compute alone, regardless of actual usage. That wasn't acceptable. The architecture had to cost almost nothing at idle and scale on demand.

Architecture & Approach

The cost constraint wasn't a limitation; it was the brief.

Stage 01 — Hypothesis

Design around the cost constraint from day one.

Rather than reaching for standard vector database tooling and optimising later, we designed the cost architecture before writing a line of application code. The goal: near-zero idle cost, with the ability to scale to thousands of queries on demand. That shaped every infrastructure decision that followed.

Stage 02 — Experiment

Benchmark the approach against real numbers.

Before committing, we validated the retrieval approach against real data and real costs. Storing ~70 GB of vectors in AWS S3 vector buckets runs roughly $1–$2/month, and 1,000 queries/month add $0.50–$2. The always-on alternatives were OpenSearch Serverless at $350–$460/month and Databricks Mosaic AI at $200–$900/month. The benchmark confirmed that the order-of-magnitude saving was real, not theoretical, and it set the architecture.
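The arithmetic behind that comparison can be sketched in a few lines. This is a minimal illustration using only the monthly figures quoted above; the function names are ours, and the sketch leaves out inference compute and data transfer, which the figures above do not itemise.

```python
# Monthly cost ranges in USD (low, high), taken from the figures quoted above.
S3_STORAGE = (1.0, 2.0)                  # ~70 GB of vectors in S3 vector buckets
S3_QUERIES = (0.50, 2.0)                 # 1,000 queries/month
OPENSEARCH_SERVERLESS = (350.0, 460.0)   # always-on compute
DATABRICKS_MOSAIC_AI = (200.0, 900.0)

def add_ranges(a, b):
    """Add two (low, high) cost ranges element-wise."""
    return (a[0] + b[0], a[1] + b[1])

def saving_factor(baseline, candidate):
    """Return (conservative, optimistic) ratios of baseline cost to candidate cost.

    Conservative: cheapest baseline against most expensive candidate.
    Optimistic: most expensive baseline against cheapest candidate.
    """
    return (baseline[0] / candidate[1], baseline[1] / candidate[0])

s3_total = add_ranges(S3_STORAGE, S3_QUERIES)
vs_opensearch = saving_factor(OPENSEARCH_SERVERLESS, s3_total)
vs_databricks = saving_factor(DATABRICKS_MOSAIC_AI, s3_total)

print(f"S3 vector buckets total: ${s3_total[0]:.2f}-${s3_total[1]:.2f}/month")
print(f"vs OpenSearch Serverless: {vs_opensearch[0]:.1f}x to {vs_opensearch[1]:.1f}x")
print(f"vs Databricks Mosaic AI:  {vs_databricks[0]:.1f}x to {vs_databricks[1]:.1f}x")
```

At 1,000 queries/month, even the conservative ratio sits well above 10×, so the headline figure reads as a lower bound once the costs this sketch omits are added back in.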

Stage 03 — Formulation

Build the storage and compute layers for scale.

With the approach proven, we built the production system: AWS S3 vector buckets with an optimised storage layout for retrieval without always-on compute, paired with distributed serverless inference that scales to zero at idle and to thousands of concurrent queries on demand.

✓ Optimised storage layout to minimise per-query retrieval cost

✓ No always-on compute — storage only, until a query arrives

✓ $0 compute cost at idle — no reserved capacity, no wasted spend

✓ Each of the four capabilities runs as an independent, composable service
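The "storage only, until a query arrives" pattern can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: the in-memory `OBJECT_STORE` dict stands in for object storage, the key layout and the names `put_shard` and `handle_query` are hypothetical, and brute-force cosine scoring substitutes for whatever index the production system uses. It makes no AWS calls.

```python
import json
import math
from typing import Dict, List, Tuple

# Hypothetical stand-in for object storage: key -> serialized shard.
# Nothing runs between queries; this dict is the only persistent state.
OBJECT_STORE: Dict[str, bytes] = {}

def put_shard(capability: str, shard_id: int, vectors: Dict[str, List[float]]) -> str:
    """Write one shard of vectors under a per-capability key prefix (assumed layout)."""
    key = f"{capability}/shard-{shard_id:04d}.json"
    OBJECT_STORE[key] = json.dumps(vectors).encode("utf-8")
    return key

def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def handle_query(capability: str, query: List[float], top_k: int = 3) -> List[Tuple[str, float]]:
    """Stateless handler: read only this capability's shards, score, return top-k."""
    prefix = capability + "/"
    hits: List[Tuple[str, float]] = []
    for key, blob in OBJECT_STORE.items():
        if not key.startswith(prefix):
            continue  # other capabilities' data is never read
        for asset_id, vector in json.loads(blob).items():
            hits.append((asset_id, cosine(query, vector)))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits[:top_k]

# Usage: two capabilities stored side by side, queried independently.
put_shard("frame-search", 0, {"ad_30s": [1.0, 0.0], "ad_15s": [0.9, 0.1]})
put_shard("frame-search", 1, {"other_spot": [0.0, 1.0]})
put_shard("image-equivalents", 0, {"banner": [1.0, 0.0]})

results = handle_query("frame-search", [1.0, 0.0], top_k=2)
print(results)
```

Because the handler keeps no state between calls, any number of copies can run concurrently during a burst and none need to exist when there is no traffic. The per-capability prefix mirrors the idea of independent, composable services sharing one storage layer.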

Stage 04 — Execution

Live in client dashboards, scaling with demand.

The four visual intelligence capabilities run live in ExtremeReach's client dashboards. Media workloads are spiky — campaign launches drive sudden bursts followed by silence — and the serverless design absorbs that pattern, scaling within seconds and costing nothing when idle, with no reserved capacity or manual oversight.

Results & Impact

What changed.

Before Ferrous Labs	After Ferrous Labs
Manual asset search — slow, error-prone, and unscalable	Four distinct visual intelligence capabilities — live in client dashboards
No way to find related versions of the same creative	Frame search, semantic search, structural similarity, and image equivalents — one platform
No semantic or visual retrieval across the library	Order-of-magnitude lower running cost than standard approaches
Days to process 100K+ videos	Zero cost at idle — scales to thousands of queries on demand. Days of processing → hours.

The cost constraint wasn't a limitation — it was the brief. Standard tooling would have solved the problem and cost 10× more. Designing around the constraint forced better decisions at every layer of the stack.

Ferrous Labs engineering note

Technology

Stack

AWS S3 Vector Buckets, AWS Lambda, Faiss, Qdrant, PyTorch, OpenCV, Docker

Need a capability your engineers can call?

Talk to a co-founder.

If you're delivering CV, vector retrieval, or cost-engineered ML at scale — we've delivered this before. Book a discovery call.

