cli99/llm-analysis

A Python tool that estimates training and inference latency and memory consumption for transformer-based language models given GPU, data type, and parallelism configurations.

487 stars Python LLMOps · Eval
Velocity (7d): +0.4 ★ / day · Trend: steady

The project automates performance estimation for large language models by computing FLOPs, memory usage, and latency based on model, hardware, data type, and parallelism settings. It provides both a Python API (LLMAnalysis class) and command-line entry points for quick calculations. The tool covers parallelism schemes, activation recomputation, data types, and inference assumptions to help ML engineers plan resource requirements for LLM training and serving.
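The kind of estimate described above can be illustrated with a back-of-the-envelope sketch. This is not the llm-analysis API: the function name `estimate_training`, its parameters, and the `Estimate` container are hypothetical, and the formulas are the common rules of thumb (about 6 FLOPs per parameter per token for training without activation recomputation, weights split evenly across tensor- and pipeline-parallel ranks), which the tool may refine or replace.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    weight_memory_gb_per_gpu: float  # model weights only, per GPU
    train_flops: float               # total training FLOPs
    train_time_days: float           # wall-clock estimate


def estimate_training(
    num_params: float,
    num_tokens: float,
    bytes_per_param: int,
    num_gpus: int,
    peak_tflops_per_gpu: float,
    flops_efficiency: float,
    tp_size: int = 1,
    pp_size: int = 1,
) -> Estimate:
    """Rough training estimate (hypothetical helper, not part of llm-analysis)."""
    if not 0 < flops_efficiency <= 1:
        raise ValueError("flops_efficiency must be in (0, 1]")

    # Assumption: weights are sharded evenly across tensor- and
    # pipeline-parallel ranks; optimizer states and activations are ignored.
    weight_bytes_per_gpu = num_params * bytes_per_param / (tp_size * pp_size)

    # Assumption: ~6 FLOPs per parameter per token (forward + backward),
    # with no activation recomputation.
    train_flops = 6 * num_params * num_tokens

    # Sustained cluster throughput = peak throughput scaled by efficiency.
    sustained_flops_per_s = num_gpus * peak_tflops_per_gpu * 1e12 * flops_efficiency
    train_seconds = train_flops / sustained_flops_per_s

    return Estimate(
        weight_memory_gb_per_gpu=weight_bytes_per_gpu / 1e9,
        train_flops=train_flops,
        train_time_days=train_seconds / 86400,
    )


if __name__ == "__main__":
    # Illustrative inputs only: a 7B-parameter model in 16-bit precision,
    # 1T tokens, 64 GPUs at 312 peak TFLOPS, 50% assumed efficiency.
    est = estimate_training(7e9, 1e12, 2, 64, 312.0, 0.5)
    print(f"weights per GPU: {est.weight_memory_gb_per_gpu:.1f} GB")
    print(f"training FLOPs:  {est.train_flops:.2e}")
    print(f"training time:   {est.train_time_days:.1f} days")
```

The efficiency factor dominates the time estimate, so it is worth treating as a range rather than a single number when planning resources.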
