Lewei Lu

luotto

ottolu

luotto's activity

liked 2 datasets 2 days ago

nvidia/describe-anything-dataset

Viewer • Updated 2 days ago • 916k • 2.73k • 19

future-technologies/Universal-Transformers-Dataset

Viewer • Updated 1 minute ago • 168M • 5.1k • 87

authored a paper 3 days ago

VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models

Paper • 2504.15279 • Published 6 days ago • 63

upvoted a paper 3 days ago

VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models

Paper • 2504.15279 • Published 6 days ago • 63

authored a paper 12 days ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published 13 days ago • 244

upvoted a paper 12 days ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published 13 days ago • 244

upvoted 2 papers 17 days ago

Inference-Time Scaling for Generalist Reward Modeling

Paper • 2504.02495 • Published 24 days ago • 54

Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing

Paper • 2504.02826 • Published 24 days ago • 67

upvoted a paper 18 days ago

OmniSVG: A Unified Scalable Vector Graphics Generation Model

Paper • 2504.06263 • Published 19 days ago • 155

liked a dataset 26 days ago

MrDragonFox/Elise

Viewer • Updated about 1 month ago • 1.2k • 2.91k • 30

upvoted 2 papers about 1 month ago

InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks

Paper • 2312.14238 • Published Dec 21, 2023 • 21

Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy

Paper • 2503.19757 • Published Mar 25 • 50

upvoted a collection about 1 month ago

InternLM3

Collection

6 items • Updated Feb 11 • 25

upvoted 3 papers about 1 month ago

Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM

Paper • 2503.14478 • Published Mar 18 • 47

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published Feb 16 • 155

VisualPRM: An Effective Process Reward Model for Multimodal Reasoning

Paper • 2503.10291 • Published Mar 13 • 36

authored a paper about 1 month ago

VisualPRM: An Effective Process Reward Model for Multimodal Reasoning

Paper • 2503.10291 • Published Mar 13 • 36

upvoted a paper about 1 month ago

GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

Paper • 2503.10639 • Published Mar 13 • 50

upvoted a paper 2 months ago

CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction

Paper • 2502.07316 • Published Feb 11 • 49

authored a paper 2 months ago

MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction

Paper • 2502.11663 • Published Feb 17 • 39