
NVIDIA/ChatRTX

NVIDIA demo app for running local RAG chatbots on Windows with TensorRT-LLM acceleration.

Velocity (7d): +3.2 ★/day · Trend: steady

ChatRTX is a retrieval-augmented generation (RAG) chatbot application that runs entirely on Windows RTX PCs. It uses TensorRT-LLM and NVIDIA NIM microservices for accelerated LLM inference and supports a range of open-weight models, including Llama 3.1, Mistral 7B, ChatGLM3, and Llama 2, with Whisper for voice input and CLIP for image understanding. Users point the app at a folder of documents and query them by text or voice to get fast, contextually relevant answers.
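To illustrate the retrieval step that any RAG pipeline performs before generation, here is a toy sketch in plain Python. It is not taken from ChatRTX: the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`), the sample documents, and the bag-of-words scoring are all assumptions made for illustration. A production system would typically use a learned embedding model and a vector index instead of word counts, and would pass the resulting prompt to an LLM.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=1):
    """Return the names of the k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine(query_vec, Counter(tokenize(text))), name)
        for name, text in documents.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


def build_prompt(query, documents, k=1):
    """Prepend the retrieved document text to the question (hypothetical format)."""
    context = "\n".join(documents[name] for name in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Hypothetical stand-in for a folder of documents, keyed by file name.
docs = {
    "gpu.txt": "TensorRT-LLM accelerates LLM inference on RTX GPUs.",
    "voice.txt": "Whisper transcribes spoken questions into text.",
}

print(build_prompt("How is inference accelerated on RTX GPUs?", docs))
```

The prompt produced by `build_prompt` is what a RAG application would hand to its language model. Grounding the answer in retrieved text is what lets the model respond about documents it was never trained on.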
