Alibaba's cure for LLMs that hallucinate code review comments
A hybrid CLI tool that uses deterministic pipelines to keep LLM agents from drifting off-target during code review.

What it does
Open Code Review is a CLI tool that reads Git diffs, feeds changed files to an LLM agent, and generates line-level review comments. It started as Alibaba’s internal AI code review assistant, reportedly serving tens of thousands of developers and flagging millions of defects before being open-sourced. You install it via npm or a native binary, point it at an OpenAI or Anthropic endpoint, and run ocr review.
The interesting bit
The project treats pure LLM agents as unreliable for code review: they “cut corners” on large changesets, drift from actual line numbers, and produce unstable results under small prompt changes. The fix is a hybrid architecture in which deterministic code handles file selection, bundling, rule matching, and comment positioning, while the agent handles only dynamic context retrieval and decision-making. It’s essentially guardrails with an LLM inside, not an LLM with guardrails bolted on.
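To make that division of labor concrete, here is a minimal Python sketch of the idea, not the project's actual code. The rule table, the `ChangedFile` and `Comment` types, and the `review` function are all hypothetical names invented for illustration; the agent is just a callable you pass in. Rule selection and the final line check are plain code, and the model is consulted only in between.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import Callable

# Hypothetical rule table (glob pattern -> checklist names).
# This is an illustration, not the project's real ruleset.
RULES = {
    "*.java": ["null-dereference", "thread-safety"],
    "*.sql": ["sql-injection"],
    "*.html": ["xss"],
}


@dataclass
class ChangedFile:
    path: str
    changed_lines: set[int]  # line numbers touched by the diff


@dataclass
class Comment:
    path: str
    line: int
    text: str


# The agent is any callable that takes a file plus its checklists
# and returns candidate comments. In practice this would call an LLM.
Agent = Callable[[ChangedFile, list[str]], list[Comment]]


def match_rules(path: str) -> list[str]:
    """Deterministic step: pick checklists by file pattern, not by prompt."""
    return [
        rule
        for pattern, rules in RULES.items()
        if fnmatch(path, pattern)
        for rule in rules
    ]


def review(files: list[ChangedFile], agent: Agent) -> list[Comment]:
    accepted = []
    for f in files:
        rules = match_rules(f.path)
        if not rules:
            continue  # no applicable checklist, so no LLM call at all
        for c in agent(f, rules):
            # Deterministic guard: keep only comments that land on
            # lines the diff actually touched.
            if c.path == f.path and c.line in f.changed_lines:
                accepted.append(c)
    return accepted
```

With a stub agent that returns one comment on a changed line and one on an untouched line, `review` keeps only the first, and a file with no matching pattern never reaches the agent.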
Key highlights
- Built-in ruleset — Fine-tuned templates for NPEs, thread-safety, XSS, and SQL injection, matched to files via template engine rather than natural language.
- Smart bundling — Groups related files (e.g., i18n property files) into sub-agents with isolated context, enabling concurrent review of large changesets.
- External positioning module — Separate reflection and positioning modules try to fix the “line numbers are wrong” problem endemic to LLM reviews.
- Multi-integration — CLI, CI/CD (GitHub Actions/GitLab CI examples), Claude Code plugin, and generic agent skill via npx skills.
- Viewer — Built-in web UI (ocr viewer) to inspect raw LLM request/response traces for debugging.
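Two of the highlights above, bundling and positioning, are easy to picture in code. The Python sketch below is one plausible way to do each, written under assumptions: the source does not describe how the project groups files or repairs line numbers, so the grouping key (directory plus extension), the `bundle` and `relocate` function names, and the idea of having the agent quote the offending line are all hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Optional


def bundle(paths: list[str]) -> list[list[str]]:
    """Group changed files that share a directory and extension.

    Assumption: files such as i18n/messages_en.properties and
    i18n/messages_zh.properties belong in one review unit.
    """
    groups: dict[tuple[str, str], list[str]] = defaultdict(list)
    for p in paths:
        pp = PurePosixPath(p)
        groups[(str(pp.parent), pp.suffix)].append(p)
    return [sorted(group) for group in groups.values()]


def relocate(quoted: str, claimed_line: int, file_lines: list[str]) -> Optional[int]:
    """Find the real 1-based line for a comment.

    The agent is asked to quote the line it is commenting on. We ignore
    its claimed number except as a tie-breaker: among lines whose text
    matches the quote, pick the one closest to the claim. If the quote
    does not appear in the file, return None so the comment can be dropped.
    """
    target = quoted.strip()
    if not target:
        return None
    matches = [
        i for i, text in enumerate(file_lines, start=1) if text.strip() == target
    ]
    if not matches:
        return None
    return min(matches, key=lambda i: abs(i - claimed_line))
```

For example, if the agent quotes `f = open(path)` and claims line 7 in a three-line file where that statement sits on line 2, `relocate` returns 2. Each list returned by `bundle` could then be handed to its own sub-agent, which is where the concurrency would come from.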
Caveats
- The “millions of defects” and “tens of thousands of developers” claims are from Alibaba’s self-reported internal usage; no independent verification is provided.
- The README is truncated mid-sentence on viewer security details, so the full allowlist behavior is unclear.
- Requires bringing your own LLM endpoint and API key; no bundled model or free tier.
Verdict
Worth a look if you’re running code review at scale and have burned hours debugging why Claude Code’s line numbers don’t match reality. Probably overkill for small teams already happy with GitHub Copilot’s pass/fail vibe.