Running 18 TravelPlannerLeaderboard 💻 Display and submit evaluation results for travel planning
Running 543 Vision Arena (Testing VLMs side-by-side) 🖼 Analyze images to detect and label objects