Candidate Relevancy:
The Metric AI Sourcing Was Missing


Jerry Sahon · November 2025

5 min read

AI Sourcing / HR Tech / People search

When you’re inventing something new, you can’t rely only on standard metrics.

Most tools in AI sourcing today just wrap filters in a nice interface. Some use semantic search. Some call themselves “AI.” But almost none can answer the most important question:
Are the people in the top results actually the right people?
That question sounds simple, but it’s not — and we spoke about it briefly in our AI Sourcing Tools Benchmark.
Relevance in people search is messy. It’s not about matching titles or buzzwords — it’s about fit, context, intent, trajectory. About understanding what makes someone a true match, not just a match on paper. It’s what defines real candidate sourcing quality.
We needed a new compass
So we built our own metric.
Internally, we call it Relevancy.

Of course, we also track classic ranking metrics like NDCG, Precision, and Recall; they remain important for internal performance tracking and research.
But for us — and for our users — there’s one number that truly matters: Candidate Relevancy.

Relevancy isn’t just about skills or keywords. We went deeper.
It’s about how people grow, what drives them, how their career stories align with the role behind the query.
That’s what makes it not just another candidate relevancy metric, but something closer to a human benchmark for fit.
Inside the team, we simply call it relevancy — lowercase, because it’s become part of our daily language.
How we built it
Relevancy measures how useful our search results really are — as judged by recruiters, not by machines.
Technically, it’s the share of high-quality candidates in the top results for a query.
If nine of the top ten results are a strong match, the score for that query is 0.9, which tells us the system understood the request in nearly every detail.
That’s what we now call recruiting search quality — the ability to surface the right profiles fast, with context and accuracy.
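To make that definition concrete, here is a minimal sketch of how such a score could be computed, assuming binary recruiter judgments and a fixed top-k cutoff. The function names and the labeling scheme are illustrative, not our production implementation.

```python
from statistics import mean

def relevancy_at_k(labels: list[int], k: int = 10) -> float:
    """Share of high-quality candidates among the top-k results for one query.

    `labels` are recruiter judgments for the ranked results, in rank order:
    1 = strong match, 0 = not a match (a simplified binary scheme).
    """
    top = labels[:k]
    return sum(top) / len(top) if top else 0.0

def mean_relevancy(labeled_queries: dict[str, list[int]], k: int = 10) -> float:
    """Average the per-query score over a set of labeled recruiter queries."""
    return mean(relevancy_at_k(labels, k) for labels in labeled_queries.values())

# Nine of the top ten judged as strong matches -> 0.9 for that query.
print(relevancy_at_k([1, 1, 1, 0, 1, 1, 1, 1, 1, 1]))  # 0.9
```

A per-query score like this would typically be averaged over a whole labeled query set, which is what the second function sketches.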

Behind that simple definition hides months of work:
We hand-labeled thousands of candidates from real recruiter queries.
We invited professional recruiters to review results, so our evaluations aligned with the real market and real workflows.
We created internal annotation guidelines from scratch, structured but human; a simplified record is sketched below.
We trained ourselves to think both like a machine and like a recruiter.
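For a sense of what one labeled example might contain, here is a sketched annotation record. Every field name and scale below is illustrative; it is not the actual internal schema.

```python
from dataclasses import dataclass

@dataclass
class CandidateJudgment:
    """One hand-labeled (query, candidate) pair, as a recruiter might record it.

    All field names and scales are illustrative, not the actual internal schema.
    """
    query_id: str          # the original recruiter query being evaluated
    candidate_id: str      # the profile shown in the results
    role_fit: int          # e.g. 0 = no fit, 1 = partial fit, 2 = strong fit
    trajectory_fit: int    # does the career path actually point toward this role?
    would_interview: bool  # the bottom-line recruiter call
    notes: str = ""        # free-form reasoning, fed back into the guidelines
```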

We didn’t just ask, “Would you interview this person?”
We built structured decision programs around role fit, career trajectory, and common recruiting biases — identifying the ones that matter, the ones that hurt, and trying to encode that collective human intuition into something measurable.
🎯 The goal was to find a common decision logic — an internal “grammar” of fit and a foundation for true candidate sourcing metrics.
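As a loose illustration of what a structured decision program could look like, here is a toy version; the rules, weights, and thresholds are hypothetical and are not the actual logic behind Relevancy.

```python
def fit_decision(judgment: dict) -> str:
    """A toy decision program over one recruiter judgment (illustrative only).

    Combines role fit and career trajectory, and down-weights one common bias:
    over-rewarding brand-name employers when the actual work doesn't match
    the role. Every rule and threshold here is hypothetical.
    """
    score = 2 * judgment["role_fit"] + judgment["trajectory_fit"]
    if judgment.get("brand_name_employer") and judgment["role_fit"] == 0:
        score -= 1  # impressive on paper, but not a match for this role
    if score >= 4:
        return "strong match"
    if score >= 2:
        return "borderline: ask a second recruiter"
    return "not a match"

# Example: strong role fit plus an upward trajectory reads as a strong match.
print(fit_decision({"role_fit": 2, "trajectory_fit": 1}))
```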
All the best-practice techniques from the industry and the global ML community gave us only 0.43.
When we launched our first version of Relevancy, the score was 0.43.
That means less than half of the top results were consistently relevant.
It was painful to see, but it was real — and it gave us a place to start.
And like in Formula 1, the closer you get to perfection, the harder it becomes to move forward.
The first tens of percentage points came from major innovations — new architectures, smarter indexing, better heuristics. Then progress slowed.

Each new 10% took months.
Then 3%. Then 1%.

At some point, every fraction of improvement demanded insane creativity, coordination, and breakthrough ideas — like a Formula 1 team redesigning the entire car just to gain a tenth of a second on the finish line.
Real tech, not a hype product
Relevancy became our north star.
It was on every dashboard, every whiteboard, every conversation.
It guided our roadmap, our priorities, our experiments.
We’re not building a hype product.
We’re building a real technology — an infrastructure layer that powers sourcing itself, and sets new standards for candidate search performance.

And eventually, we reached levels we didn’t think were possible at the start — high enough to define our own people search benchmark.

➡️ In the next article, we’ll share how far we’ve come — the numbers we hit, what they mean for us and the industry, and what comes next.

Because once you define a benchmark, you also take responsibility for where the bar is set.


To see where the bar is now, check out the product live; the latest updates are always delivered there.
