OSU-NLP-Group/TravelPlanner
A benchmark suite for evaluating language agents on real-world, multi-step travel planning tasks.

Velocity (7d): +0.6 ★/day · Trend: steady
TravelPlanner is a benchmark that evaluates how well language agents (LLM-based autonomous systems) perform complex, real-world planning tasks such as trip scheduling. The repository provides an evaluation environment, a dataset, a reference database, format-checking tools, and fine-tuned model checkpoints. It also includes a leaderboard comparing LLMs on agentic planning performance, and it uses LLaMA-Factory for model fine-tuning.