WildVision Team

non-profit

WildVision-Bench

Activity Feed

WildVision's activity

yuchenlin

authored a paper 4 days ago

Small Models Struggle to Learn from Strong Reasoners

Paper • 2502.12143 • Published 7 days ago • 26

DongfuJiang

authored a paper 19 days ago

ACECODER: Acing Coder RL via Automated Test-Case Synthesis

Paper • 2502.01718 • Published 21 days ago • 28

yuchenlin

authored a paper 20 days ago

ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning

Paper • 2502.01100 • Published 21 days ago • 15

FunCube

updated a Space 2 months ago

542

Vision Arena (Testing VLMs side-by-side)

🖼

Analyze images to detect and label objects

yuchenlin

authored a paper 4 months ago

On Memorization of Large Language Models in Logical Reasoning

Paper • 2410.23123 • Published Oct 30, 2024 • 18

FunCube

updated a Space 4 months ago

README

🌍

NeurIPS'24 Datasets & Benchmarks

DongfuJiang

authored a paper 4 months ago

MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks

Paper • 2410.10563 • Published Oct 14, 2024 • 39

DongfuJiang

updated a Space 5 months ago

542

Vision Arena (Testing VLMs side-by-side)

🖼

Analyze images to detect and label objects

DongfuJiang

updated a dataset 5 months ago

WildVision/wildvision-bench

Viewer • Updated Oct 4, 2024 • 1k • 209 • 8

DongfuJiang

updated 3 datasets 6 months ago

saxon

authored 7 papers 8 months ago

Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies

Paper • 2308.03188 • Published Aug 6, 2023 • 2

Let's Think Frame by Frame: Evaluating Video Chain of Thought with Video Infilling and Prediction

Paper • 2305.13903 • Published May 23, 2023

Visual Chain of Thought: Bridging Logical Gaps with Multimodal Infillings

Paper • 2305.02317 • Published May 3, 2023

WikiWhy: Answering and Explaining Cause-and-Effect Questions

Paper • 2210.12152 • Published Oct 21, 2022 • 1

Not All Errors are Equal: Learning Text Generation Metrics using Stratified Error Synthesis

Paper • 2210.05035 • Published Oct 10, 2022

TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and Image-to-Video Generation

Paper • 2406.08656 • Published Jun 12, 2024 • 8

Losing Visual Needles in Image Haystacks: Vision Language Models are Easily Distracted in Short and Long Contexts

Paper • 2406.16851 • Published Jun 24, 2024

FunCube

updated a dataset 8 months ago

WildVision/wildvision-arena-data

Viewer • Updated Jul 1, 2024 • 10.8k • 321 • 6

Team members 5
