techjarves/USB-Uncensored-LLM

A pocket-sized AI that won't scold you

This project wraps Ollama, portable Python, and a small HTTP server into a cross-platform bundle that fits on a USB stick.

Velocity (7d): +26 ★/day · Trend: steady

What it does

USB-Uncensored-LLM is a portable local AI environment designed to run from a USB drive or SSD without installation. It bundles OS-specific scripts, a portable Python runtime, and a custom Ollama engine to serve a web-based chat UI. Models and chat history live in a Shared folder, so you can carry them between Windows, macOS, Linux, and even Android (via Termux).

The interesting bit

The “uncensored” angle is the project’s stated selling point: it ships with curated “ablative” and “heretic” fine-tuned models whose safety alignment has been stripped out. Whether that’s a feature or a bug depends entirely on your threat model and ethics. The technical cleverness is the Shared volume architecture: one copy of multi-gigabyte model weights, accessed natively by different OS binaries.
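To make the Shared-folder idea concrete, here is a minimal Python sketch of how a launcher might choose an OS-specific engine binary while pointing every platform at the same model directory. The folder names (Windows, Mac, Linux, Shared/models) and the launch_plan helper are assumptions made for illustration, not taken from the repository. OLLAMA_MODELS is the environment variable stock Ollama reads for its model directory; whether this project's custom engine relies on it is also an assumption.

```python
import platform
from pathlib import Path

# Hypothetical per-OS engine folders on the drive (not taken from the repository).
ENGINE_DIRS = {"Windows": "Windows", "Darwin": "Mac", "Linux": "Linux"}


def launch_plan(drive_root, system=None):
    """Return the engine path and environment for the current (or given) OS.

    Every OS gets a different engine binary but the same model directory,
    so the multi-gigabyte weights are stored only once.
    """
    system = system or platform.system()
    if system not in ENGINE_DIRS:
        raise ValueError(f"no bundled engine for {system!r}")

    root = Path(drive_root)
    binary = "ollama.exe" if system == "Windows" else "ollama"
    return {
        "engine": root / ENGINE_DIRS[system] / binary,
        # Assumed: the engine honours OLLAMA_MODELS like stock Ollama does.
        "env": {"OLLAMA_MODELS": str(root / "Shared" / "models")},
    }
```

Calling launch_plan("/mnt/usb", "Linux") and launch_plan("/mnt/usb", "Windows") yields different engine paths but an identical OLLAMA_MODELS value, which is the whole point of the design.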

Key highlights

  • Zero system installation: runs from a folder, no registry edits or package managers
  • Cross-platform Shared folder for models, chat history, and portable Python
  • Curated model installer (Windows batch file) for GGUF weights from HuggingFace
  • LAN access: a phone or tablet on the same Wi-Fi network can reach the chat UI at the host machine’s IP address on port 3333
  • Android support through Termux, though the 2B model is the only practical option for most phones
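For the LAN access point above, a small Python sketch can build the URL a phone would open and check from another machine whether the port is accepting connections. Only port 3333 comes from the review; the helper names and the example IP address are hypothetical, and the use of plain http is an assumption.

```python
import socket

UI_PORT = 3333  # port named in the review


def ui_url(host_ip, port=UI_PORT):
    """Build the address a phone or tablet on the same network would open."""
    return f"http://{host_ip}:{port}"


def is_reachable(host_ip, port=UI_PORT, timeout=2.0):
    """Return True if a TCP connection to host_ip:port succeeds within timeout."""
    try:
        with socket.create_connection((host_ip, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, ui_url("192.168.1.20") gives the address to type into the phone's browser, and is_reachable("192.168.1.20") helps tell a firewall or network problem apart from a server that simply is not running.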

Caveats

  • The README’s “zero-dependency” claim is slightly overstated: initial setup downloads a ~50MB engine per OS, and model downloads require internet (or manual file placement)
  • The interactive model downloader is a Windows batch file, so Windows is the recommended path; on other operating systems you copy .gguf files into place manually
  • Android performance is modest: 3–10 tokens/sec on the 2B model, with aggressive battery drain
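To put the Android throughput figure in perspective, the short Python calculation below converts tokens per second into waiting time. The 300-token reply length is an assumption chosen for illustration; only the 3–10 tokens/sec range comes from the review.

```python
def reply_seconds(tokens, tokens_per_sec):
    """Time in seconds to generate `tokens` at a steady `tokens_per_sec`."""
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return tokens / tokens_per_sec


REPLY_TOKENS = 300  # assumed length of a medium-sized answer

slowest = reply_seconds(REPLY_TOKENS, 3)   # 100.0 seconds
fastest = reply_seconds(REPLY_TOKENS, 10)  # 30.0 seconds
print(f"{REPLY_TOKENS}-token reply: {fastest:.0f} to {slowest:.0f} seconds")
```

Under that assumption, a single answer takes between half a minute and well over a minute, which is usable for short questions but slow for long outputs.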

Verdict

Worth a look if you need a truly portable, air-gapped LLM setup across multiple machines, or if you’re specifically seeking uncensored models without cloud dependencies. Skip it if you already have a comfortable local Ollama workflow; this is largely a packaging and distribution convenience layer.
