NVIDIA/ChatRTX
NVIDIA demo app for running local RAG chatbots on Windows with TensorRT-LLM acceleration.

Velocity (7d): +3.2 ★/day · Trend: steady
ChatRTX is a retrieval-augmented generation (RAG) chatbot application that runs entirely on Windows RTX PCs. It uses TensorRT-LLM and NVIDIA NIM microservices for accelerated LLM inference and supports a range of open-weight models, including Llama 3.1, Mistral 7B, ChatGLM3, and Llama 2, with Whisper for voice input and CLIP for image understanding. Users point the app at a folder of documents and then query them by text or voice to get fast, contextually relevant answers.
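The retrieval step of a RAG pipeline like the one described above can be illustrated with a small, generic sketch. This is not ChatRTX's code: it assumes a simple bag-of-words cosine similarity in place of a learned embedding model, uses an in-memory dictionary as a stand-in for the document folder, and stops at building the prompt instead of calling an LLM. The names `tokenize`, `cosine`, `retrieve`, `build_prompt`, and `docs` are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=2):
    """Return the names of up to k documents that share terms with the query."""
    q = Counter(tokenize(query))
    scored = sorted(
        ((cosine(q, Counter(tokenize(text))), name) for name, text in docs.items()),
        reverse=True,
    )
    return [name for score, name in scored[:k] if score > 0]


def build_prompt(query, docs, k=2):
    """Prepend the retrieved passages to the question (the 'augmented' part)."""
    context = "\n".join(f"[{name}] {docs[name]}" for name in retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical in-memory stand-in for a folder of documents.
docs = {
    "gpu.txt": "The RTX GPU accelerates local LLM inference.",
    "recipes.txt": "Boil the pasta and add tomato sauce.",
}
```

Calling `build_prompt("Which GPU accelerates inference?", docs)` produces a prompt whose context contains only the `gpu.txt` passage, since the recipe text shares no terms with the question. A real system would typically swap the word-count vectors for dense embeddings and pass the resulting prompt to the model.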