---
title: Shopping MMLU Leaderboard
emoji: 🌎
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.44.1
app_file: app.py
pinned: true
license: apache-2.0
tags:
  - leaderboard
short_description: 'Massive Multi-Task LLM Benchmark for Online Shopping'
---

This leaderboard displays evaluation results obtained with Shopping MMLU. The space provides an overall leaderboard covering 4 main online shopping skills:

- Shopping Concept Understanding
- Shopping Knowledge Reasoning
- User Behavior Alignment
- Multi-lingual Abilities

GitHub: https://github.com/KL4805/ShoppingMMLU

Report: https://arxiv.org/abs/2410.20745

Please consider citing the report if this resource is useful to your research:

```BibTex
```