Is whichllm open source?

Yes — Andyyyy64/whichllm is open source, released under the MIT license.

What language is whichllm written in?

Andyyyy64/whichllm is primarily written in Python.

How popular is whichllm?

Andyyyy64/whichllm has 6k stars on GitHub, and its star growth is currently slowing.

Where can I find whichllm?

Andyyyy64/whichllm is on GitHub at https://github.com/Andyyyy64/whichllm.


Andyyyy64/whichllm

Your GPU's best local LLM, chosen by benchmarks not size

whichllm auto-detects your hardware and ranks local LLMs by merged, confidence-graded benchmarks instead of parameter count, so you run the best model that fits, not merely the biggest.

★ 6k stars · Python · Inference · Serving · LLMOps · Eval


Velocity (7d): +19 ★ / day. Trend: cooling.

What it does

whichllm inspects your GPU, CPU, RAM, and OS, then queries live HuggingFace data to rank local LLMs by a 0–100 score that weights benchmark quality, model size, quantization penalty, evidence confidence, and estimated inference speed. It can also simulate hardware you do not yet own, reverse-calculate the GPU needed for a specific model, and spawn an isolated chat session or emit copy-paste Python snippets.
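
To make the idea of a weighted 0–100 score concrete, here is a minimal sketch in Python. It is not whichllm's actual formula: the `Candidate` fields, the confidence multipliers, the 80/20 quality-to-speed split, and the `target_tps` parameter are all assumptions chosen for illustration, and the model-size term is left out for brevity.

```python
from dataclasses import dataclass

# Hypothetical multipliers. The four confidence grades are named in the
# project description, but these numeric values are invented for the example.
CONFIDENCE_DISCOUNT = {
    "direct": 1.0,
    "variant": 0.9,
    "interpolated": 0.75,
    "self_reported": 0.5,
}


@dataclass
class Candidate:
    name: str
    benchmark: float      # merged benchmark quality on a 0-100 scale
    quant_penalty: float  # assumed fraction of quality lost to quantization (0-1)
    confidence: str       # one of the CONFIDENCE_DISCOUNT keys
    speed_tps: float      # estimated decode speed in tokens per second


def score(c: Candidate, target_tps: float = 30.0) -> float:
    """Return an illustrative 0-100 score for one candidate model."""
    # Quality is discounted multiplicatively by quantization and confidence.
    quality = c.benchmark * (1.0 - c.quant_penalty) * CONFIDENCE_DISCOUNT[c.confidence]
    # Speed saturates at 100 once the model reaches the assumed target rate.
    speed = min(c.speed_tps / target_tps, 1.0) * 100.0
    # Assumed weighting: 80% quality, 20% speed.
    return round(0.8 * quality + 0.2 * speed, 1)


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Sort candidates from best to worst by the illustrative score."""
    return sorted(candidates, key=score, reverse=True)
```

With this shape, a smaller model with a stronger, directly measured benchmark result sorts ahead of a larger one whose score is weaker or only self-reported.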

The interesting bit

The scoring pipeline is unusually paranoid about benchmark inflation: it merges LiveBench, Artificial Analysis, Aider, and other sources, then demotes stale leaderboards along model lineages and rejects score inheritance when a fork’s parameter count diverges by more than 2× from that of its family’s dominant member. As a result, a 27B model can outrank a 32B model that fits just fine, because the smaller one actually scores higher on current benchmarks.
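
The 2× divergence rule can be sketched as a simple ratio check. This is an illustration, not the project's code: the function name `can_inherit_score` is hypothetical, and treating the rule as symmetric (a fork that is much larger is rejected just like one that is much smaller) is an assumption.

```python
def can_inherit_score(fork_params_b: float, dominant_params_b: float,
                      max_ratio: float = 2.0) -> bool:
    """Return True if a fork may inherit its family's benchmark score.

    Both sizes are parameter counts in billions. The fork is rejected when
    its size differs from the dominant member's by more than max_ratio.
    """
    if fork_params_b <= 0 or dominant_params_b <= 0:
        raise ValueError("parameter counts must be positive")
    larger = max(fork_params_b, dominant_params_b)
    smaller = min(fork_params_b, dominant_params_b)
    return larger / smaller <= max_ratio
```

Under these assumptions, a 7B fork of an 8B family keeps the inherited score, while a 1.5B distillation of the same family does not and would need its own evidence.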

Key highlights

Auto-detects NVIDIA, AMD, Apple Silicon, or CPU-only setups and estimates VRAM use down to KV cache and activation overhead
Ranks by evidence confidence (direct, variant, interpolated, self-reported) with multiplicative discounts; self-reported scores are heavily down-weighted
Simulates arbitrary GPUs for pre-purchase planning and compares upgrade candidates side by side
Fetches live HuggingFace data with curated frozen fallbacks for offline or rate-limited use
Emits JSON for shell pipelines and can launch models directly without manual dependency wrangling
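
For a sense of what a VRAM estimate that includes the KV cache involves, here is a rough sketch using the standard back-of-the-envelope formulas. It is not taken from whichllm: the function name, its parameters, and the flat 10% activation overhead are assumptions made for the example.

```python
def estimate_vram_gb(params_b: float, bits_per_weight: float,
                     n_layers: int, n_kv_heads: int, head_dim: int,
                     context_len: int, kv_bytes: int = 2,
                     overhead_frac: float = 0.10) -> float:
    """Estimate VRAM in GB (10^9 bytes) for weights, KV cache, and overhead."""
    # Quantized weights: parameter count times bits per weight, in bytes.
    weight_bytes = params_b * 1e9 * bits_per_weight / 8
    # KV cache: one key and one value vector per layer, per KV head, per token.
    kv_cache_bytes = 2 * n_layers * n_kv_heads * head_dim * context_len * kv_bytes
    # Assumed flat overhead for activations and runtime buffers.
    total_bytes = (weight_bytes + kv_cache_bytes) * (1.0 + overhead_frac)
    return total_bytes / 1e9
```

For example, `estimate_vram_gb(8, 4, n_layers=32, n_kv_heads=8, head_dim=128, context_len=8192)` models an 8B model at 4 bits per weight with a 16-bit KV cache; doubling `context_len` grows only the KV cache term.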

Caveats

Ollama integration requires a manual mapping step because Ollama model names do not always match HuggingFace repo IDs
Speed estimates are planning ranges derived from bandwidth and quantization heuristics, not live runtime benchmarks
The tool is explicitly not a TUI; if you want a keyboard-driven interface, this is not it
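
The caveat about speed estimates is easier to weigh with an example of what a bandwidth-based heuristic looks like. The sketch below uses the common approximation that decode speed is bounded by how fast the weights can be read from memory once per token. It is not whichllm's estimator: the function name and the 50% to 80% efficiency band are assumptions.

```python
def estimate_decode_tps(bandwidth_gb_s: float, params_b: float,
                        bits_per_weight: float,
                        efficiency: tuple[float, float] = (0.5, 0.8)
                        ) -> tuple[float, float]:
    """Return a (low, high) planning range for decode tokens per second."""
    # Size of the quantized weights in GB; each generated token reads them once.
    model_gb = params_b * bits_per_weight / 8
    # Upper bound if memory bandwidth were used perfectly.
    ideal_tps = bandwidth_gb_s / model_gb
    low, high = efficiency
    return ideal_tps * low, ideal_tps * high
```

Because the result depends only on bandwidth and model size, it says nothing about prompt processing, batching, or driver quirks, which is why such figures are best read as planning ranges.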

Verdict

Worth a look if you are tired of manually cross-referencing VRAM spreadsheets with stale leaderboards. Skip it if you already have a curated local setup and do not need automated hardware detection or benchmark merging.
