JMMMU

non-profit

https://mmmu-japanese-benchmark.github.io/JMMMU/

Recent Activity

AtsuMiyai new activity 23 days ago

JMMMU/JMMMU: Typo in validation_Agriculture_1

AtsuMiyai updated a Space 26 days ago

JMMMU/JMMMU_Leaderboard

yuexiang96 authored a paper about 2 months ago

Demystifying Long Chain-of-Thought Reasoning in LLMs

JMMMU's activity

AtsuMiyai

in JMMMU/JMMMU 23 days ago

Typo in validation_Agriculture_1

#3 opened about 2 months ago by

d-sato

AtsuMiyai

updated a Space 26 days ago

JMMMU Leaderboard

🥇

Evaluating LMMs on Japanese subjects

yuexiang96

authored 2 papers about 2 months ago

Demystifying Long Chain-of-Thought Reasoning in LLMs

Paper • 2502.03373 • Published Feb 5 • 58

Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate

Paper • 2501.17703 • Published Jan 29 • 57

yuexiang96

authored a paper 2 months ago

Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos

Paper • 2501.13826 • Published Jan 23 • 25

yuexiang96

authored 6 papers 4 months ago

JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation

Paper • 2410.17250 • Published Oct 22, 2024 • 15

yuexiang96

authored a paper 5 months ago

Teach Multimodal LLMs to Comprehend Electrocardiographic Images

Paper • 2410.19008 • Published Oct 21, 2024 • 24

yuki-imajuku

authored a paper 5 months ago

JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation

Paper • 2410.17250 • Published Oct 22, 2024 • 15

yuki-imajuku

updated a dataset 5 months ago

JMMMU/JMMMU

Viewer • Updated Oct 25, 2024 • 1.32k rows • 1.02k downloads • 16 likes

ku21fan

authored a paper 5 months ago

JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation

Paper • 2410.17250 • Published Oct 22, 2024 • 15

shtapm

authored a paper 5 months ago

JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation

Paper • 2410.17250 • Published Oct 22, 2024 • 15

yuexiang96

posted an update 5 months ago

🌍 I’ve always had a dream of making AI accessible to everyone, regardless of location or language. However, current open MLLMs often respond in English, even to non-English queries!

🚀 Introducing Pangea: A Fully Open Multilingual Multimodal LLM supporting 39 languages! 🌐✨

https://neulab.github.io/Pangea/
https://arxiv.org/pdf/2410.16153

The Pangea family includes three major components:
🔥 Pangea-7B: A state-of-the-art multilingual multimodal LLM supporting 39 languages! Not only does it excel in multilingual scenarios, but it also matches or surpasses English-centric models like Llama 3.2, Molmo, and LLaVA-OneVision in English performance.

📝 PangeaIns: A 6M-sample multilingual multimodal instruction-tuning dataset across 39 languages. 🗂️ With 40% English instructions and 60% multilingual instructions, it spans various domains, including 1M culturally relevant images sourced from LAION-Multi. 🎨

🏆 PangeaBench: A comprehensive evaluation benchmark featuring 14 datasets in 47 languages. Evaluation can be tricky, so we carefully curated existing benchmarks and introduced two new datasets: xChatBench (human-annotated wild queries with fine-grained evaluation criteria) and xMMMU (a meticulously machine-translated version of MMMU).

Check out more details: https://x.com/xiangyue96/status/1848753709787795679

yuexiang96

authored 3 papers 5 months ago

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published Oct 21, 2024 • 44

MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures

Paper • 2410.13754 • Published Oct 17, 2024 • 75

KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks

Paper • 2410.06526 • Published Oct 9, 2024 • 1

Team members 6
