AI Service Build — Computer Vision · Vector Search · ML Systems

Four capabilities, one platform, cost-optimised: visual intelligence at broadcast scale.

A media library of tens of thousands of video and image assets. A hard cost constraint. We built a four-capability visual search platform that standard tooling would have cost 10× more to run.

10× cheaper than standard vector DBs
$0 compute cost at idle
4 visual intelligence capabilities
Days → hours processing time reduction
Engagement Snapshot

Sector: Ad Tech / Media Intelligence
Service: Platform Design & Engineering
Scale: Tens of thousands of video & image assets
Cost outcome: 10× reduction
Processing time: Days to hours — distributed and scalable

Key Highlights

10× cheaper than standard vector database approaches
Zero idle cost — distributed serverless compute scales to zero when not in use
Four distinct capabilities — frame search, semantic search, structural similarity, image equivalents
S3 vector buckets + optimised storage — cost-efficient retrieval without a dedicated vector DB
Scales to thousands of queries on demand — spiky media workloads handled without over-provisioning
Context & Challenge

The media library contained tens of thousands of video and image assets per client — and it grows constantly. A single campaign generates multiple versions of the same creative: different durations, crops for different placements, regional re-edits, localised audio. Finding, grouping, and retrieving related assets was slow, manual, and increasingly unworkable at scale.

The product requirement was clear: build a visual intelligence layer that lets clients search and discover assets intelligently — not just by filename or metadata, but by what the content actually is and looks like. Four capabilities were needed: frame-level retrieval, semantic video search, structural similarity across versions, and the same suite for images.

The engineering constraint was equally clear: the system had to be cheap to run. Media query loads are spiky — bursts of activity around campaign launches, then silence. Standard vector database solutions would have cost $350–$460/month in always-on compute alone, regardless of actual usage. That wasn't acceptable. The architecture had to cost almost nothing at idle and scale on demand.

Architecture & Approach

The cost constraint wasn't
a limitation — it was the brief.

Stage 01 — Constraint

Design around the cost constraint from day one.

Rather than reaching for standard vector database tooling and optimising later, we designed the cost architecture before writing a line of application code. The goal: near-zero idle cost, with the ability to scale to thousands of queries on demand. That shaped every infrastructure decision that followed.
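A back-of-envelope cost model is enough to show why this ordering matters. The sketch below uses the illustrative figures quoted in this case study (not live AWS pricing), and the per-query cost is an assumed value for demonstration:

```python
# Hedged sketch: the kind of back-of-envelope cost model run before any
# application code was written. All figures are illustrative ranges from
# this case study; the per-query cost is an assumption for the example.

def monthly_cost(storage_usd, per_query_usd, queries_per_month, idle_compute_usd):
    """Total monthly cost = storage + query volume + always-on compute."""
    return storage_usd + per_query_usd * queries_per_month + idle_compute_usd

# S3 vector buckets: ~$1.50 storage for ~70GB, no always-on compute.
s3_vectors = monthly_cost(1.50, 0.001, 1_000, 0)

# Dedicated vector DB: same storage and queries, plus ~$400/month idle compute.
dedicated_db = monthly_cost(1.50, 0.001, 1_000, 400)

print(f"S3 vectors:   ${s3_vectors:.2f}/month")
print(f"Dedicated DB: ${dedicated_db:.2f}/month")
```

The always-on compute term dominates everything else at this query volume — which is exactly why it was the first thing designed out.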

Stage 02 — Store

S3 vector buckets — not a vector database.

We used AWS S3 vector buckets combined with an optimised storage layout to handle vector retrieval — without the always-on compute cost of a dedicated vector database. Storage for ~70GB of vectors costs roughly $1–2/month. Query costs at 1,000/month run to $0.50–$2. Compared to OpenSearch Serverless ($350–$460/month) or Databricks Mosaic AI ($200–$900/month), the difference is an order of magnitude.

Optimised storage layout to minimise per-query retrieval cost
No always-on compute — storage only, until a query arrives
Indexing and sync costs near-zero
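The retrieval pattern can be sketched as follows. This is a minimal illustration, not the production code: vectors are assumed to sit as flat float32 shards in object storage, and a query fetches one shard and brute-force scores it. The shard format, dimension, and function names are all assumptions for the example — in production the bytes would come from an S3 read rather than being passed in directly:

```python
# Hedged sketch of shard-based vector retrieval from object storage.
# Format, dimension, and helper names are illustrative assumptions.
import numpy as np

DIM = 512  # embedding dimension (assumed for the example)

def serialize_shard(vectors: np.ndarray) -> bytes:
    """Pack an (N, DIM) float32 matrix as raw bytes, as it would sit in S3."""
    return vectors.astype(np.float32).tobytes()

def search_shard(shard_bytes: bytes, query: np.ndarray, top_k: int = 5):
    """Cosine-score a query against one shard's raw bytes.

    In production the bytes would come from an object-store read
    (e.g. s3.get_object); here they are passed in so the sketch runs
    self-contained.
    """
    vecs = np.frombuffer(shard_bytes, dtype=np.float32).reshape(-1, DIM)
    # Normalise rows so a dot product equals cosine similarity.
    vecs = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = vecs @ q
    idx = np.argsort(scores)[::-1][:top_k]
    return list(zip(idx.tolist(), scores[idx].tolist()))

rng = np.random.default_rng(0)
shard = serialize_shard(rng.standard_normal((1000, DIM)))
query = rng.standard_normal(DIM).astype(np.float32)
results = search_shard(shard, query, top_k=3)
```

Because a shard is just bytes, the only standing cost is storage; compute exists only for the duration of the query.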
Stage 03 — Compute

Distributed serverless — zero at idle, thousands on demand.

Inference runs on distributed serverless compute that scales to zero when not in use and handles thousands of concurrent queries when needed. Media workloads are spiky — campaign launches drive sudden bursts of activity, followed by silence. Serverless matches that pattern exactly: always-on compute would mean paying for capacity that sits idle 80% of the time.

$0 compute cost at idle — no reserved capacity, no wasted spend
Scales to thousands of concurrent queries within seconds
Each of the four capabilities runs as an independent, composable service
Results & Impact

What changed.

Before Ferrous Labs | After Ferrous Labs
--- | ---
Manual asset search — slow, error-prone, and unscalable | Four distinct visual intelligence capabilities — live in client dashboards
No way to find related versions of the same creative | Frame search, semantic search, structural similarity, and image equivalents — one platform
No semantic or visual retrieval across the library | Order-of-magnitude lower running cost than standard approaches
Days to process 100K+ videos | Zero cost at idle — scales to thousands of queries on demand. Days of processing → hours.
The cost constraint wasn't a limitation — it was the brief. Standard tooling would have solved the problem and cost 10× more. Designing around the constraint forced better decisions at every layer of the stack.
Ferrous Labs engineering note
Technology

Stack

AWS S3 Vector Buckets · AWS Lambda · Faiss · Qdrant · PyTorch · OpenCV · Docker
Need a capability your engineers can call?

Talk to engineering.

If you need CV, vector retrieval, or cost-engineered ML at scale — we've delivered this before. Book a discovery call.