Engineering·May 7, 2026·4 min read

How we catch scam and ghost jobs before scoring

Spend ten minutes browsing job boards and you'll see the pattern. "Earn $5,000/week from home, no experience needed, send $100 for training materials." "We're always looking for talented engineers — submit your resume to our pipeline." "URGENT!!! Senior React Developer!!!" These three are different kinds of bad: outright scam, ghost job (the company isn't really hiring), low-effort listing.

If our agent surfaces these on your dashboard, we've failed. Not only do you lose trust in our recommendations, you waste time on something that was never going to be real. So we built a multi-layer validator that runs before the LLM scorer — pure regex and string ops, no AI cost, no AI latency.

The seven layers

  • Title spam: exclamation floods, all-caps titles, urgency words like URGENT or APPLY NOW.
  • Scam keywords: payment requests, unrealistic earnings, money-mule patterns ("package reshipper"), off-platform interview channels ("WhatsApp only").
  • Ghost-job indicators: "always looking," "talent pool," "rolling basis," "evergreen position."
  • Suspicious domains: free TLDs (.tk, .ml, .ga) and numbered domain patterns like jobs-123.com.
  • Salary reality: a salary range more than 2x above market for the role's seniority is a strong scam signal, and so is one suspiciously far below the market floor.
  • Description quality: a description under 50 words, or one with no requirements section, almost never corresponds to a real opening.
  • Contact verification: a "recruiter" who claims to be from Razorpay but emails from a personal Gmail address gets flagged.
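
To make the list above concrete, here is a minimal Python sketch of three of the layers (title spam, ghost-job indicators, suspicious domains). The function names, the exact patterns, and the point values are all assumptions for illustration; the real regexes and weights are not published in this post.

```python
import re

# Hypothetical patterns and point values, chosen only for illustration.
URGENCY = re.compile(r"\b(urgent|apply now)\b", re.IGNORECASE)
EXCLAMATION_FLOOD = re.compile(r"!{2,}")
GHOST_PHRASES = re.compile(
    r"always looking|talent pool|rolling basis|evergreen position",
    re.IGNORECASE,
)
FREE_TLD = re.compile(r"\.(tk|ml|ga)$", re.IGNORECASE)
NUMBERED_DOMAIN = re.compile(r"-\d+\.[a-z]+$", re.IGNORECASE)


def title_spam_points(title: str) -> int:
    """Score exclamation floods, urgency words, and all-caps titles."""
    points = 0
    if EXCLAMATION_FLOOD.search(title):
        points += 15
    if URGENCY.search(title):
        points += 15
    letters = [c for c in title if c.isalpha()]
    if letters and all(c.isupper() for c in letters):
        points += 10
    return points


def ghost_job_points(description: str) -> int:
    """Score phrases that suggest a standing pipeline rather than an opening."""
    return 20 if GHOST_PHRASES.search(description) else 0


def domain_points(domain: str) -> int:
    """Score free TLDs and numbered domains such as jobs-123.com."""
    domain = domain.strip().lower()
    if FREE_TLD.search(domain) or NUMBERED_DOMAIN.search(domain):
        return 25
    return 0
```

Each function returns points rather than a yes/no answer, so a single weak signal can contribute to the total without rejecting a listing on its own.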

How it actually works

Each layer adds to a cumulative risk score from 0 to 100. At 60 or above we reject the listing entirely and mark it as a scam in our database, so the next user who fetches the same posting from the same source gets the result instantly without re-validating. From 30 to 59 the job passes, but we log a warning that we use to tune the regexes over time.
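
The scoring step can be sketched roughly as follows. The 60 and 30 thresholds and the 0 to 100 range come from the description above; everything else (the `Verdict` type, the function names, the in-memory dict standing in for the database, and treating scores under 30 as a clean pass) is an assumption made for the example.

```python
from dataclasses import dataclass, field

REJECT_AT = 60  # from the post: 60 or above is rejected and marked as a scam
WARN_AT = 30    # from the post: 30 to 59 passes with a logged warning


@dataclass
class Verdict:
    score: int
    status: str  # "reject", "warn", or "pass"
    reasons: list[str] = field(default_factory=list)


def score_listing(layer_points: dict[str, int]) -> Verdict:
    """Sum per-layer points, cap the total at 100, and apply the thresholds."""
    reasons = [name for name, pts in layer_points.items() if pts > 0]
    score = min(100, sum(max(0, pts) for pts in layer_points.values()))
    if score >= REJECT_AT:
        status = "reject"
    elif score >= WARN_AT:
        status = "warn"
    else:
        status = "pass"  # assumption: anything under 30 passes silently
    return Verdict(score, status, reasons)


# Hypothetical stand-in for the database of rejected postings,
# keyed by (source, posting id).
_rejected: dict[tuple[str, str], Verdict] = {}


def validate(source: str, posting_id: str, layer_points: dict[str, int]) -> Verdict:
    """Return a stored rejection if one exists; otherwise score the listing."""
    key = (source, posting_id)
    if key in _rejected:
        return _rejected[key]
    verdict = score_listing(layer_points)
    if verdict.status == "reject":
        _rejected[key] = verdict
    return verdict
```

Keeping the layer names in `reasons` is one way to get the warning data that later feeds regex tuning.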

Cross-source duplicate detection runs as an eighth step on top of the seven layers above. The same Razorpay role can show up via both the Greenhouse fetcher (because Razorpay uses Greenhouse) and via Kimi's web search (because the careers page also surfaced in a search result). A fingerprint of normalized title + company + location collapses these to one row before scoring.
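
A fingerprint like this could look something like the sketch below. The normalization rules (Unicode folding, lowercasing, stripping punctuation, collapsing whitespace), the SHA-256 hash, the "keep the first copy" policy, and the dict field names are all assumptions; the post only says that normalized title, company, and location are combined.

```python
import hashlib
import re
import unicodedata


def normalize(text: str) -> str:
    """Fold case, replace punctuation with spaces, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text).casefold()
    text = re.sub(r"[^\w\s]", " ", text)
    return " ".join(text.split())


def fingerprint(title: str, company: str, location: str) -> str:
    """Hash the normalized title, company, and location into one key."""
    key = "|".join(normalize(part) for part in (title, company, location))
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def dedupe(jobs: list[dict]) -> list[dict]:
    """Keep the first listing seen for each fingerprint."""
    seen: dict[str, dict] = {}
    for job in jobs:
        fp = fingerprint(job["title"], job["company"], job["location"])
        seen.setdefault(fp, job)
    return list(seen.values())


jobs = [
    {"title": "Senior Backend Engineer", "company": "Razorpay",
     "location": "Bangalore", "source": "greenhouse"},
    {"title": "senior  backend engineer ", "company": "RAZORPAY",
     "location": "Bangalore.", "source": "web_search"},
]
unique_jobs = dedupe(jobs)  # both rows share one fingerprint
```

An exact-match fingerprint is cheap but strict: "Bengaluru" and "Bangalore" would still hash differently, so a real system would likely need an alias table or fuzzier matching on top.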

All of this happens before the LLM scorer runs, so we don't pay for Kimi tokens to score a listing we'd reject anyway, and no scam ever reaches your dashboard. It's the kind of work that's invisible when it works, which is the goal.
