princeton-nlp/WebShop

A simulated e-commerce website environment with 1.18 million products and 12,087 instructions for benchmarking language agents.

549 stars · Python · Agents · Domain Apps
Star velocity (7d): +0.4 ★/day · Trend: steady

WebShop provides a realistic web interaction benchmark where agents must navigate webpages, search for products, and make purchases based on natural language instructions. It challenges agents with compositional instruction understanding, query reformulation, and handling noisy information. The environment is used to train and evaluate reinforcement learning and language-grounded agents in a sim-to-real framework.
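
To make the search-then-purchase loop described above concrete, here is a minimal toy sketch in Python. It does not use WebShop's code or API: the `Product` class, the three-item `CATALOG`, and the `search`, `reward`, and `run_episode` functions are all hypothetical names invented for illustration, and the reward is a simplified attribute-overlap score, not WebShop's actual reward function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    name: str
    attributes: frozenset
    price: float


# Hypothetical three-item catalog standing in for a real product database.
CATALOG = [
    Product("red cotton t-shirt", frozenset({"red", "cotton"}), 12.0),
    Product("blue cotton t-shirt", frozenset({"blue", "cotton"}), 15.0),
    Product("red wool sweater", frozenset({"red", "wool"}), 40.0),
]


def search(query: str) -> list:
    """Rank products by how many query words appear in the product name."""
    words = set(query.lower().split())
    scored = [(len(words & set(p.name.split())), p) for p in CATALOG]
    # sorted() is stable, so ties keep catalog order.
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [p for score, p in ranked if score > 0]


def reward(chosen: Product, wanted: set, max_price: float) -> float:
    """Toy reward: fraction of wanted attributes matched, 0 if over budget."""
    if chosen.price > max_price:
        return 0.0
    return len(wanted & chosen.attributes) / len(wanted)


def run_episode(query: str, wanted: set, max_price: float) -> float:
    """A trivial 'agent': search once, then buy the top-ranked result."""
    results = search(query)
    if not results:
        return 0.0
    return reward(results[0], wanted, max_price)


if __name__ == "__main__":
    # Instruction: "a red cotton shirt under $20"
    print(run_episode("red cotton shirt", {"red", "cotton"}, 20.0))
```

A real agent would have to reformulate the query when the first search returns poor results and read noisy product pages before buying; the single-search policy here only shows the shape of the interaction.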
