coderonion/awesome-cuda-and-hpc
A curated list of CUDA, HPC, and GPU optimization projects for machine learning, including TensorRT-LLM, Triton, TVM, and LLM-related frameworks.

This repository aggregates open-source projects for GPU computing and the high-performance computing infrastructure behind ML/AI systems. It catalogs tools for model inference optimization (TensorRT, TensorRT-LLM), kernel programming (Triton, CUTLASS), compiler toolchains (TVM, MLIR, LLVM), and collective communication for distributed training (NCCL). The list also covers frameworks, benchmarks, and learning resources for deploying and optimizing large language models and vision models on GPU hardware.