xNul/code-llama-for-vscode

Glue code that frees you from API keys

A tiny Flask shim lets you run Meta's Code Llama inside VS Code without signing up for anything.

Velocity (7d): +0.6 ★ / day · Trend: steady

What it does

llamacpp_mock_api.py is a single-file Flask server that impersonates llama.cpp’s API. The Continue VS Code extension thinks it’s talking to llama.cpp, but it’s actually talking to Meta’s official codellama inference code. You get local code completion with no API key, no cloud service, and no Ollama.
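To make the impersonation idea concrete, here is a minimal sketch of what such a shim could look like. It is not the project's actual code. It assumes a llama.cpp-style `/completion` route that accepts JSON with `prompt` and `n_predict` and returns the text under `content`; `generate_text` is a hypothetical stand-in for the call into Meta's codellama inference code.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def generate_text(prompt: str, max_tokens: int) -> str:
    """Hypothetical stand-in for the call into Meta's codellama inference code."""
    return f"# completion for {len(prompt)} prompt chars, up to {max_tokens} tokens"


@app.route("/completion", methods=["POST"])
def completion():
    # Assumed request shape: llama.cpp-style JSON with "prompt" and "n_predict".
    body = request.get_json(force=True, silent=True) or {}
    prompt = body.get("prompt", "")
    max_tokens = int(body.get("n_predict", 128))
    text = generate_text(prompt, max_tokens)
    # Assumed response shape: the generated text under "content".
    return jsonify({"content": text, "stop": True})


# Exercise the route in-process with Flask's test client (no network needed).
client = app.test_client()
resp = client.post("/completion", json={"prompt": "def add(a, b):", "n_predict": 16})
print(resp.get_json()["content"])
```

The client only ever sees the HTTP contract, which is why the backend behind `generate_text` can be swapped without the editor extension noticing. To serve it for real you would call `app.run(...)` on whatever host and port the client is configured to reach.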

The interesting bit

The whole project is one Python file. The author admits in the README that it’s “glue”, but it’s glue that fills a real platform gap: according to the README, Ollama did not support Windows or Linux at the time of writing, and Continue’s native llama.cpp provider expects a different interface than Meta’s torchrun setup. This shim bridges the two without touching either codebase.

Key highlights

  • Single-file implementation (llamacpp_mock_api.py)
  • Cross-platform wherever Meta’s codellama runs (explicitly Windows and Linux, which Ollama did not support when the README was written)
  • Zero API keys or account signups required
  • Works with Continue’s existing llama.cpp configuration: just swap the model name to codellama-7b
  • Requires only Flask as an additional dependency
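As a rough illustration of the Continue bullet above, the sketch below builds a model entry and merges it into a config dictionary. The field names (`title`, `provider`, `model`, `apiBase`), the `models` list, and the local address are assumptions about Continue's JSON config, and `with_model` is a hypothetical helper; check the project's README and Continue's documentation for the exact format your version expects.

```python
import json

# Hypothetical model entry; field names are assumptions about Continue's JSON config.
model_entry = {
    "title": "Code Llama",
    "provider": "llama.cpp",
    "model": "codellama-7b",
    "apiBase": "http://localhost:8080",  # assumed address of the local shim
}


def with_model(config: dict, entry: dict) -> dict:
    """Return a copy of config whose "models" list ends with entry.

    Any existing entry with the same title is dropped first, so applying
    the function twice does not create duplicates.
    """
    models = [m for m in config.get("models", []) if m.get("title") != entry["title"]]
    return {**config, "models": models + [entry]}


config = with_model({"models": []}, model_entry)
print(json.dumps(config, indent=2))
```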

Caveats

  • You must already have Meta’s codellama set up and working on your machine; this doesn’t bundle or simplify the model setup
  • The README qualifies its claims with “as of the time of writing”, so the landscape may have shifted since then

Verdict

Worth 10 minutes if you’re already running Code Llama locally and want VS Code integration without the platform limits the README attributes to Ollama. Skip it if you want a one-click installer, or if you’re on macOS, where Ollama works fine anyway.
