When you build a product around AI in 2026, the default assumption is that you'll route different tasks to different model providers. GPT-4 for writing. Claude for long-context reasoning. Gemini for search. Open-source for cheap classification. The argument is that each model has its strengths and you should pick whichever wins on a given benchmark.
We chose differently. Every agent in Aplyqk routes to Moonshot Kimi K2.6 — match scoring, web-search-driven job discovery, cover letter generation, reply classification, draft replies, interview detail extraction, resume parsing, resume polish. One model end to end.
Why one provider
- One billing relationship. Multi-provider setups become a tax burden the moment you have to reconcile usage across three services.
- One voice. If our cover letter writer runs on GPT and our reply drafter runs on Claude, recruiters see two different writing styles in the same conversation thread. Subtle, but they notice.
- One prompt-engineering surface. Tuning a multi-agent system across providers means version-controlling N sets of prompts that each behave slightly differently. We touch one set.
- Cost predictability. Kimi K2.6 is roughly 5x cheaper than GPT-4 for our task mix. At our cost target ($0.30 per user per day), no other model gets us there without aggressive rate limiting that would degrade the product.
Why Kimi specifically
The differentiator is native tool calling and $web_search. Aplyqk's job hunter agent literally needs to browse the open web — LinkedIn, careers pages, Hacker News "Who's Hiring" — and pull back current openings. Other major models either don't expose web search through their API at all, or do so behind cumbersome wrappers. Kimi makes it a first-class capability.
For everything else (classification, structured extraction, short-form professional writing), Kimi K2.6 is genuinely competitive with the frontier models. The places where another model might edge it out — say, GPT-4 on creative long-form prose — aren't where this product spends its tokens.
What we'd do differently if scale demanded it
Our architecture isn't religiously single-provider. Every agent in our config maps to a provider name in one YAML file. If we ever find that a specific agent — say, cover letter writing — measurably improves on Claude, flipping that one mapping is one line of YAML. We just haven't seen that signal yet, and "keep it boring" wins until we do.