
huggingface/evaluate

A Python library from Hugging Face for standardized evaluation of machine learning models using plug-in metrics and measurement tools.

2.5k stars · Python · LLMOps · Eval
Velocity (7d): +1.6 ★ / day · Trend: steady

🤗 Evaluate is a library that makes evaluating models, comparing them, and reporting their performance easier and more standardized. It provides implementations of dozens of popular metrics for tasks ranging from NLP to computer vision. A metric is loaded by name, for example accuracy = evaluate.load("accuracy"), and can then score models from any framework, including NumPy, Pandas, PyTorch, TensorFlow, and JAX. Alongside metrics, the library offers comparisons (for contrasting two models) and measurements (for inspecting datasets).
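The loading pattern described above can be sketched as follows. This is a minimal illustration, not an official example: it assumes the `evaluate` package (and its scikit-learn dependency for the accuracy metric) is installed and that the metric script can be fetched. The `simple_accuracy` and `compute_accuracy` helpers and the sample labels are hypothetical, added only so the snippet still runs when the library or network is unavailable.

```python
def simple_accuracy(predictions, references):
    """Plain-Python stand-in (hypothetical helper, not part of evaluate)."""
    correct = sum(p == r for p, r in zip(predictions, references))
    return {"accuracy": correct / len(references)}


def compute_accuracy(predictions, references):
    """Score with evaluate's accuracy metric, falling back to the stand-in."""
    try:
        import evaluate  # assumed to be installed: pip install evaluate

        metric = evaluate.load("accuracy")
        return metric.compute(predictions=predictions, references=references)
    except Exception:
        # Library missing, or the metric could not be loaded (e.g. offline).
        return simple_accuracy(predictions, references)


# Made-up labels: three of the four predictions match the references.
result = compute_accuracy(predictions=[0, 1, 1, 0], references=[0, 1, 0, 0])
print(result)
```

Both paths return a dictionary keyed by the metric name, so downstream reporting code does not need to know which one produced the score.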
