Open-Source Hiring AI That Runs Locally and Explains Every Score

Staff Writer

HackerRank’s Hiring Agent rejects black-box SaaS in favor of local LLMs, GitHub signals, and auditable prompt templates.

interviewstreet/hiring-agent

The Screening Arms Race Has a Transparency Problem

The average corporate role now attracts roughly 257 applications, and manual screening consumes 23 hours per hire. Those figures explain why AI resume screening has become one of the most crowded corners of enterprise software. Adoption estimates suggest it climbed from 26 percent of organizations in 2024 to 43 percent in 2025, and vendors have responded with a dizzying fragmentation of approaches. Eightfold.ai maps millions of skills across billions of profiles. HireVue layers video interviews atop resume parsing. Workday’s Recruiting Agent promises a 54 percent boost in recruiter capacity. Glide, ZBrain, and Whippy.ai market “agentic” screeners that chat, score, and schedule.

Yet the same surveys that tout efficiency also warn of the trade-offs. Generative AI alone does not verify accuracy; it tends to reinforce whatever information is already present, true or inflated. Predictive models trained on historical hiring data learn past preferences, including demographic bias. And most platforms are cloud-native black boxes: the criteria live in proprietary models, opaque dashboards, and marketing brochures that boast of “algorithmic audits” and “Responsible AI badges.” Recruiters get speed, but they rarely get to inspect the logic.

HackerRank’s Counteroffer: Glue Code You Can Read

Against that backdrop, InterviewStreet, the parent entity behind HackerRank, released Hiring Agent, an open-source pipeline that treats resume evaluation as a local, auditable script rather than a managed service. The repository is unapologetically glue code: PyMuPDF converts PDF pages to Markdown-like text, an LLM prompted through Jinja templates extracts sectioned JSON, a GitHub module fetches public repository signals, and an evaluator applies strict scoring rules with fairness constraints. The output is a structured score with evidence, bonus points, and deductions, optionally cached to CSV during development.
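
The shape of that output can be sketched in a few lines of Python. The class and field names below (CategoryScore, Evaluation, the per-category maximum) are illustrative assumptions, not the repository's actual schema; the sketch only shows how a score can carry its evidence and still flatten to one CSV row per candidate.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class CategoryScore:
    # Hypothetical shape: a score out of a maximum, plus the evidence behind it.
    name: str
    score: float
    max_score: float
    evidence: str


@dataclass
class Evaluation:
    candidate: str
    categories: list[CategoryScore] = field(default_factory=list)
    bonus: float = 0.0
    deductions: float = 0.0

    def total(self) -> float:
        # Category scores plus bonus points, minus deductions.
        return sum(c.score for c in self.categories) + self.bonus - self.deductions

    def to_csv_row(self) -> dict[str, object]:
        # One flat row per candidate, with one column per category.
        row: dict[str, object] = {"candidate": self.candidate}
        for c in self.categories:
            row[c.name] = c.score
        row.update(bonus=self.bonus, deductions=self.deductions, total=self.total())
        return row


def write_csv(evaluations: list[Evaluation]) -> str:
    # Serialize evaluations to CSV text; assumes every candidate was scored
    # on the same set of categories.
    rows = [e.to_csv_row() for e in evaluations]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The evidence strings stay on the in-memory object here and only the numbers reach the CSV; a real export might add an evidence column per category.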

What makes it noteworthy is not the novelty of any single component. PDF parsing, LLM extraction, and API enrichment are standard fare. The difference is packaging and posture. While competitors sell “AI agents” and “orchestration platforms,” Hiring Agent gives you the orchestration script itself, written in Python, licensed under MIT, and designed to run on a laptop with Ollama. There is no per-seat pricing, no required cloud dependency, and no vendor lock-in: only a requirements file and a choice of local or hosted models.

GitHub as a Signal, Not a Sidebar

Most resume screeners, even modern ones, rely heavily on keyword matching and inferred skills. A candidate writes “Python” and “machine learning,” and the system checks boxes. Hiring Agent adds a layer that is difficult to fake: live GitHub data. The module extracts a username from the resume, fetches public profile and repository metadata, classifies each project, and then asks the LLM to select exactly seven unique projects that meet a minimum author-commit threshold. The evaluation criteria explicitly score open_source, self_projects, production experience, and technical_skills.
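
A minimal sketch of the deterministic parts of that flow follows. In the project the final pick of seven is made by the LLM; the code below is only a pre-filter that could run before that step. The URL pattern, the Repo fields, the default threshold of five commits, and the ranking rule are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern: a github.com/<username> link somewhere in the resume text.
GITHUB_URL = re.compile(r"github\.com/([A-Za-z0-9][A-Za-z0-9-]{0,38})", re.IGNORECASE)


def extract_username(resume_text: str) -> Optional[str]:
    # Return the first GitHub username found in the text, or None.
    match = GITHUB_URL.search(resume_text)
    return match.group(1) if match else None


@dataclass(frozen=True)
class Repo:
    # Illustrative subset of repository metadata.
    name: str
    author_commits: int
    stars: int


def shortlist(repos: list[Repo], min_author_commits: int = 5, limit: int = 7) -> list[Repo]:
    # Keep unique repositories (by case-insensitive name) that meet the
    # author-commit threshold, ranked by commits and then stars.
    seen: set[str] = set()
    eligible: list[Repo] = []
    for repo in repos:
        key = repo.name.lower()
        if key in seen or repo.author_commits < min_author_commits:
            continue
        seen.add(key)
        eligible.append(repo)
    eligible.sort(key=lambda r: (r.author_commits, r.stars), reverse=True)
    return eligible[:limit]
```

Filtering on the candidate's own commits rather than on stars is what makes the signal hard to fake: a forked repository with thousands of stars and no authored commits never reaches the shortlist.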

For engineering hires, this is a materially different signal. Enterprise platforms like Eightfold.ai or Workday may approximate technical proficiency through skills ontologies, but they rarely surface raw commit graphs or ask a model to judge whether a candidate’s side projects demonstrate meaningful ownership. By binding the resume narrative to verifiable public artifacts, Hiring Agent narrows the gap between what a candidate claims and what their code actually shows. The use case is narrow, since candidates without a public GitHub footprint (most non-developers, and developers whose work lives in private repositories) give the module nothing to score, but for technical recruiting it is a richer form of evidence than keyword density.

Prompts as Policy Documents

Perhaps the most distinctive detail sits in the prompts directory. Every extraction instruction and scoring rule is written in Jinja templates stored as plain text. The evaluator does not rely on a fine-tuned neural network trained on years of proprietary hiring decisions. Instead, it feeds structured resume data and GitHub metadata into a prompt that encodes fairness constraints, category weights, and evidentiary standards.

This is a declarative alternative to the black-box models that dominate the market. Findem and other analysts note that predictive screening tools often repeat historical biases because the bias is baked into the training data. Hiring Agent sidesteps that problem by making the policy visible. If a candidate receives a deduction for an employment gap or a bonus for open source contributions, the rule lives in a template that can be version-controlled, peer-reviewed, and audited. In an industry where transparency is usually a marketing slide, open prompts are a small but meaningful act of resistance.

The README underscores this by asking contributors to “keep prompts declarative and provider-agnostic.” That constraint carries a real maintenance burden, since different LLMs interpret the same template differently, but it preserves the project’s central promise: the hiring criteria belong to the organization, not to the vendor.
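
To make the idea of a prompt as a policy document concrete, here is a rough sketch. It uses the standard library's string.Template as a stand-in for Jinja so that it runs without extra dependencies, and the weights, rule wording, and names (POLICY, PROMPT, render_prompt) are invented for illustration; none of them come from the project's templates.

```python
from string import Template

# Hypothetical policy: these category weights are illustrative, not the project's.
POLICY = {
    "open_source": 35,
    "self_projects": 30,
    "production": 25,
    "technical_skills": 10,
}

# Plain-text template; in the project this role is played by Jinja files.
PROMPT = Template(
    "Score the candidate out of $total points.\n"
    "$weights\n"
    "Rules:\n"
    "- Cite resume or GitHub evidence for every score.\n"
    "- Do not consider name, gender, age, or college prestige.\n"
    "\n"
    "Resume JSON:\n"
    "$resume_json\n"
)


def render_prompt(resume_json: str, policy: dict[str, int] = POLICY) -> str:
    # Turn the policy table into bullet lines and fill the template.
    weights = "\n".join(
        f"- {name}: up to {points} points" for name, points in policy.items()
    )
    return PROMPT.substitute(
        total=sum(policy.values()), weights=weights, resume_json=resume_json
    )
```

Because the weights and rules are ordinary text and data, a change to the policy shows up as a readable diff in version control, which is the property the article attributes to the project's templates.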

Local-First in a Cloud-First Market

Hiring Agent supports Google Gemini, but its default posture is local. The README documents Ollama integration for Gemma 3 models ranging from 1B to 12B parameters, and the entire pipeline can run without an external API key. That design choice carries real consequences. Resumes never leave the machine, which sidesteps data-residency headaches and API costs. It also means a hiring manager can iterate on prompts and scoring rules without burning cloud tokens or waiting on SaaS release cycles.
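
For readers who have not used Ollama, a local call looks roughly like the sketch below. It targets Ollama's documented HTTP endpoint on its default port; whether Hiring Agent talks to Ollama this way or through a client library is not established here, and the model tag gemma3:4b and the helper names are assumptions.

```python
import json
import urllib.request

# Ollama's default local endpoint for single-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "gemma3:4b") -> urllib.request.Request:
    # Build the POST request without sending it. "stream": False asks for one
    # JSON reply, and "format": "json" asks the model to emit valid JSON.
    payload = {"model": model, "prompt": prompt, "stream": False, "format": "json"}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "gemma3:4b") -> str:
    # Requires a running Ollama server with the model already pulled.
    with urllib.request.urlopen(build_request(prompt, model), timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

The request never leaves localhost, which is the whole data-residency argument in one line of configuration.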

The trade-off is consistency. A 4B local model may hallucinate section boundaries or misread a GitHub repository’s purpose in ways that a hosted frontier model would not. The repository acknowledges this pragmatically: it caches intermediate JSON and CSV outputs in development mode so users can inspect and correct errors before finalizing a score. This is not lights-out automation. It is a drafting table.
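
The caching idea is simple enough to sketch. The function below is a hypothetical helper, not the project's code: it stores each stage's JSON result in a file keyed by the stage name and a hash of the input, so a rerun reuses the stored result instead of calling the model again.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def cached_json(
    cache_dir: Path, stage: str, source: bytes, compute: Callable[[], dict]
) -> dict:
    # Key the cache file on the stage name and a short hash of the input bytes.
    digest = hashlib.sha256(source).hexdigest()[:16]
    path = cache_dir / f"{stage}-{digest}.json"
    if path.exists():
        # Reuse whatever is on disk, including any manual corrections.
        return json.loads(path.read_text(encoding="utf-8"))
    result = compute()
    path.write_text(json.dumps(result, indent=2), encoding="utf-8")
    return result
```

Since the cache is a readable JSON file, a reviewer can open it, fix a misread section boundary by hand, and have the next run score the corrected version.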

The Reality Check

It is worth stating plainly what Hiring Agent is not. It does not schedule interviews, chat with applicants, or integrate bidirectionally with an ATS. It does not offer the enterprise dashboards of Workday, the diversity analytics of Eightfold.ai, or the multi-modal assessments of HireVue. Its CSV export and stdout summary suggest a tool built for internal experimentation or startup hiring workflows, not for Fortune 500 HR stacks.

Nor is the architecture itself unprecedented. PyMuPDF, Pydantic, Jinja, and the GitHub API are familiar building blocks. The value lies in the assembly: a fully local, explainable, developer-centric scorer released by a company that built its brand on technical assessments. It is a reference implementation as much as a product.

Unresolved Tensions

The project leaves several hard questions open. Can Jinja templates remain stable across Ollama and Gemini when model behavior drifts? The README already warns contributors to validate changes against real resumes under different providers, which is a labor-intensive practice that SaaS vendors absorb on behalf of their customers. And while the prompts are transparent, the LLM interpreting them is still a stochastic black box. A fairness constraint written in English can be ignored or misinterpreted by a small local model.
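
One way to make that cross-provider validation less manual is a small drift check, sketched below. The function name, the five-point tolerance, and the score dictionaries are assumptions for illustration; the project does not necessarily ship anything like it.

```python
def score_drift(
    baseline: dict[str, float],
    candidate: dict[str, float],
    tolerance: float = 5.0,
) -> dict[str, float]:
    # Return the categories whose scores differ by more than `tolerance`
    # points, treating a category missing from either side as a score of 0.
    drift: dict[str, float] = {}
    for category in sorted(set(baseline) | set(candidate)):
        delta = candidate.get(category, 0.0) - baseline.get(category, 0.0)
        if abs(delta) > tolerance:
            drift[category] = delta
    return drift
```

Run against the same resume scored by two providers, a non-empty result flags the categories where a template is being read differently and deserves a human look.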

There is also the question of scope. Hiring Agent evaluates; it does not discover. It assumes a resume has already arrived. In a market moving toward agentic sourcing and proactive talent rediscovery, a local PDF scorer is a narrow tool. But narrowness can be a virtue. By refusing to become an all-in-one platform, Hiring Agent offers something the enterprise suites do not: a small, inspectable piece of infrastructure that teams can own, modify, and distrust on their own terms.

For now, Hiring Agent sits at an interesting intersection: too opinionated to be a universal framework, too modest to be an enterprise platform, yet too transparent to be ignored by engineering teams who would rather read a Jinja template than a vendor’s trust-center whitepaper.