AI Service Build — Computer Vision · Vector Search · ML Systems

Four capabilities, one platform, cost-optimised: visual intelligence at broadcast scale.

A media library of tens of thousands of video and image assets. A hard cost constraint. We built a four-capability visual search platform that standard tooling would have cost 10× more to run.

10× cheaper than standard vector DBs
$0 compute cost at idle
4 visual intelligence capabilities
Days → hours processing time reduction
Engagement Snapshot

Sector: Ad Tech / Media Intelligence
Service: Platform Design & Engineering
Scale: Tens of thousands of video & image assets
Cost outcome: 10× reduction
Processing time: Days to hours — distributed and scalable

Key Highlights

10× cheaper than standard vector database approaches
Zero idle cost — distributed serverless compute scales to zero when not in use
Four distinct capabilities — frame search, semantic search, structural similarity, image equivalents
S3 vector buckets + optimised storage — cost-efficient retrieval without a dedicated vector DB
Scales to thousands of queries on demand — spiky media workloads handled without over-provisioning
Context & Challenge

The media library contained tens of thousands of video and image assets per client — and it grows constantly. A single campaign generates multiple versions of the same creative: different durations, crops for different placements, regional re-edits, localised audio. Finding, grouping, and retrieving related assets was slow, manual, and increasingly unworkable at scale.

The product requirement was clear: build a visual intelligence layer that lets clients search and discover assets intelligently — not just by filename or metadata, but by what the content actually is and looks like. Four capabilities were needed: frame-level retrieval, semantic video search, structural similarity across versions, and the same suite for images.

The engineering constraint was equally clear: the system had to be cheap to run. Media query loads are spiky — bursts of activity around campaign launches, then silence. Standard vector database solutions would have cost $350–$460/month in always-on compute alone, regardless of actual usage. That wasn't acceptable. The architecture had to cost almost nothing at idle and scale on demand.

Architecture & Approach

The cost constraint wasn't
a limitation — it was the brief.

Stage 01 — Constraint

Design around the cost constraint from day one.

Rather than reaching for standard vector database tooling and optimising later, we designed the cost architecture before writing a line of application code. The goal: near-zero idle cost, with the ability to scale to thousands of queries on demand. That shaped every infrastructure decision that followed.
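A back-of-envelope cost model is enough to show why this ordering matters. The sketch below uses the illustrative figures quoted in this case study (not live AWS pricing), and the per-query cost is an assumed value for demonstration:

```python
# Hedged sketch: the kind of back-of-envelope cost model run before any
# application code was written. All figures are illustrative ranges from
# this case study; the per-query cost is an assumption for the example.

def monthly_cost(storage_usd, per_query_usd, queries_per_month, idle_compute_usd):
    """Total monthly cost = storage + query volume + always-on compute."""
    return storage_usd + per_query_usd * queries_per_month + idle_compute_usd

# S3 vector buckets: ~$1.50 storage for ~70GB, no always-on compute.
s3_vectors = monthly_cost(1.50, 0.001, 1_000, 0)

# Dedicated vector DB: same storage and queries, plus ~$400/month idle compute.
dedicated_db = monthly_cost(1.50, 0.001, 1_000, 400)

print(f"S3 vectors:   ${s3_vectors:.2f}/month")
print(f"Dedicated DB: ${dedicated_db:.2f}/month")
```

The always-on compute term dominates everything else at this query volume — which is exactly why it was the first thing designed out.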

Stage 02 — Store

S3 vector buckets — not a vector database.

We used AWS S3 vector buckets combined with an optimised storage layout to handle vector retrieval — without the always-on compute cost of a dedicated vector database. Storage for ~70GB of vectors costs roughly $1–2/month. Query costs at 1,000/month run to $0.50–$2. Compared to OpenSearch Serverless ($350–$460/month) or Databricks Mosaic AI ($200–$900/month), the difference is an order of magnitude.

Optimised storage layout to minimise per-query retrieval cost
No always-on compute — storage only, until a query arrives
Indexing and sync costs near-zero
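The retrieval pattern can be sketched as follows. This is a minimal illustration, not the production code: vectors are assumed to sit as flat float32 shards in object storage, and a query fetches one shard and brute-force scores it. The shard format, dimension, and function names are all assumptions for the example — in production the bytes would come from an S3 read rather than being passed in directly:

```python
# Hedged sketch of shard-based vector retrieval from object storage.
# Format, dimension, and helper names are illustrative assumptions.
import numpy as np

DIM = 512  # embedding dimension (assumed for the example)

def serialize_shard(vectors: np.ndarray) -> bytes:
    """Pack an (N, DIM) float32 matrix as raw bytes, as it would sit in S3."""
    return vectors.astype(np.float32).tobytes()

def search_shard(shard_bytes: bytes, query: np.ndarray, top_k: int = 5):
    """Cosine-score a query against one shard's raw bytes.

    In production the bytes would come from an object-store read
    (e.g. s3.get_object); here they are passed in so the sketch runs
    self-contained.
    """
    vecs = np.frombuffer(shard_bytes, dtype=np.float32).reshape(-1, DIM)
    # Normalise rows so a dot product equals cosine similarity.
    vecs = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = vecs @ q
    idx = np.argsort(scores)[::-1][:top_k]
    return list(zip(idx.tolist(), scores[idx].tolist()))

rng = np.random.default_rng(0)
shard = serialize_shard(rng.standard_normal((1000, DIM)))
query = rng.standard_normal(DIM).astype(np.float32)
results = search_shard(shard, query, top_k=3)
```

Because a shard is just bytes, the only standing cost is storage; compute exists only for the duration of the query.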
Stage 03 — Compute

Distributed serverless — zero at idle, thousands on demand.

Inference runs on distributed serverless compute that scales to zero when not in use and handles thousands of concurrent queries when needed. Media workloads are spiky — campaign launches drive sudden bursts of activity, followed by silence. Serverless matches that pattern exactly: always-on compute would mean paying for capacity that sits idle 80% of the time.

$0 compute cost at idle — no reserved capacity, no wasted spend
Scales to thousands of concurrent queries within seconds
Each of the four capabilities runs as an independent, composable service
Results & Impact

What changed.

Before Ferrous Labs | After Ferrous Labs
--- | ---
Manual asset search — slow, error-prone, and unscalable | Four distinct visual intelligence capabilities — live in client dashboards
No way to find related versions of the same creative | Frame search, semantic search, structural similarity, and image equivalents — one platform
No semantic or visual retrieval across the library | Order-of-magnitude lower running cost than standard approaches
Days to process 100K+ videos | Zero cost at idle — scales to thousands of queries on demand. Days of processing → hours.
The cost constraint wasn't a limitation — it was the brief. Standard tooling would have solved the problem and cost 10× more. Designing around the constraint forced better decisions at every layer of the stack.
Ferrous Labs engineering note
Technology

Stack

AWS S3 Vector Buckets · AWS Lambda · Faiss · Qdrant · PyTorch · OpenCV · Docker
Need a capability your engineers can call?

Talk to engineering.

If you need CV, vector retrieval, or cost-engineered ML at scale — we've delivered this before. Book a discovery call.