
OSU-NLP-Group/TravelPlanner

A benchmark suite for evaluating language agents on real-world, multi-step travel-planning tasks.

TravelPlanner
Velocity (7d): +0.6 ★/day · Trend: steady

TravelPlanner is a benchmark that evaluates how well language agents (LLM-based autonomous systems) perform complex, real-world planning tasks such as trip scheduling. The repository provides an evaluation environment, a dataset, a reference database, format-checking tools, and fine-tuned model checkpoints. It also includes a leaderboard comparing LLMs on agentic planning performance, and it uses LLaMA-Factory for model fine-tuning.
