
Maximilian-Winter/llama-cpp-agent

An agent framework enabling function calling, structured output, and RAG for LLMs via guided sampling.

Velocity (7d): +0.7 ★ / day. Trend: steady.

The llama-cpp-agent framework provides an interface for building LLM-powered agents, with support for single and parallel function calling, structured output generation, and retrieval-augmented generation with ColBERT reranking. It uses guided sampling, constraining generation with grammars and JSON schemas, so that function calling works even on models that were not fine-tuned for it. The framework integrates with llama.cpp, llama-cpp-python, TGI, and vLLM servers, and offers conversational, sequential, and mapping agent chains.
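
To make the idea of schema-guided function calling concrete, here is a minimal sketch in plain Python. It does not use the llama-cpp-agent API: `function_to_schema`, `dispatch`, and the example tool `add` are hypothetical names invented for illustration. The sketch derives a JSON schema from a function signature and then dispatches a JSON string shaped like that schema. In a real setup a grammar-constrained sampler would force the model to emit such a string; here the model output is a hard-coded stand-in.

```python
import inspect
import json
from typing import Any, Callable, Dict, get_type_hints

# Assumption: a small, illustrative mapping from Python annotations
# to JSON Schema type names. Real frameworks support far more types.
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def function_to_schema(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Build a JSON schema describing a call to `fn` (hypothetical helper)."""
    hints = get_type_hints(fn)
    hints.pop("return", None)  # only parameters belong in the call schema
    params = list(inspect.signature(fn).parameters)
    return {
        "type": "object",
        "properties": {
            # The function name is fixed, so the model cannot pick another.
            "function": {"const": fn.__name__},
            "arguments": {
                "type": "object",
                "properties": {
                    name: {"type": _JSON_TYPES[hints[name]]} for name in params
                },
                "required": params,
                "additionalProperties": False,
            },
        },
        "required": ["function", "arguments"],
    }


def dispatch(raw_output: str, tools: Dict[str, Callable[..., Any]]) -> Any:
    """Parse schema-shaped output and call the named tool (hypothetical helper)."""
    call = json.loads(raw_output)
    return tools[call["function"]](**call["arguments"])


def add(a: int, b: int) -> int:
    """Example tool: add two integers."""
    return a + b


if __name__ == "__main__":
    schema = function_to_schema(add)
    print(json.dumps(schema, indent=2))

    # Stand-in for text a schema-constrained sampler would produce.
    model_output = '{"function": "add", "arguments": {"a": 2, "b": 3}}'
    print(dispatch(model_output, {"add": add}))
```

Because the output is constrained to the schema at sampling time, parsing and dispatch need no repair logic, which is why this approach can work for models that were never trained to emit tool calls.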
