princeton-nlp/WebShop
A simulated e-commerce website environment with 1.18 million products and 12,087 instructions for benchmarking language agents.

Velocity (7d): +0.4 ★/day · Trend: steady
WebShop provides a realistic web interaction benchmark in which agents navigate webpages, search for products, and make purchases based on natural language instructions. It challenges agents to understand compositional instructions, reformulate search queries, and handle noisy information. The environment is used to train and evaluate reinforcement learning and language-grounded agents in a sim-to-real setting.
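The search, navigate, and purchase loop described above can be illustrated with a minimal sketch. This is not the WebShop code or its API: `ToyShopEnv`, `Product`, the three-item `CATALOG`, and the simplified reward are all hypothetical stand-ins, and only the `search[...]` / `click[...]` text-action style is meant to resemble how such an environment is typically driven.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    attributes: set
    price: float


# Hypothetical three-item catalog standing in for real product data.
CATALOG = [
    Product("red cotton t-shirt", {"red", "cotton"}, 12.0),
    Product("blue cotton t-shirt", {"blue", "cotton"}, 9.0),
    Product("red wool sweater", {"red", "wool"}, 40.0),
]


class ToyShopEnv:
    """Toy stand-in for a shopping environment; NOT the WebShop implementation."""

    def __init__(self, goal_attributes, max_price):
        self.goal_attributes = set(goal_attributes)
        self.max_price = max_price
        self.results = []
        self.selected = None

    def step(self, action):
        """Apply a text action and return (observation, reward, done)."""
        if action.startswith("search[") and action.endswith("]"):
            terms = set(action[len("search["):-1].lower().split())
            # Rank products by how many query terms appear in their names.
            scored = [(len(terms & set(p.name.split())), p) for p in CATALOG]
            ranked = sorted(scored, key=lambda pair: -pair[0])
            self.results = [p for score, p in ranked if score > 0]
            return [p.name for p in self.results], 0.0, False
        if action.startswith("click[") and action.endswith("]"):
            target = action[len("click["):-1]
            if target == "buy now" and self.selected is not None:
                return "purchased", self._reward(self.selected), True
            for p in self.results:
                if p.name == target:
                    self.selected = p
                    return p.name, 0.0, False
        return "invalid action", 0.0, False

    def _reward(self, product):
        # Simplified score (an assumption of this sketch): fraction of goal
        # attributes matched, zeroed if the product is over budget.
        if product.price > self.max_price:
            return 0.0
        matched = len(self.goal_attributes & product.attributes)
        return matched / max(len(self.goal_attributes), 1)


# One episode for the instruction "a red cotton shirt under $20".
env = ToyShopEnv({"red", "cotton"}, max_price=20.0)
results, _, _ = env.step("search[red cotton shirt]")
env.step(f"click[{results[0]}]")
_, reward, done = env.step("click[buy now]")
print(reward, done)
```

Scoring only at purchase time keeps the sketch close to the task framing: the agent is judged on whether the item it finally buys matches the instruction, not on individual page visits.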