Alex Jinpeng Wang

Awiny

https://fingerrec.github.io

FingerRec

AI & ML interests

Multi-Modality Pre-training, Data-Centric AI, Video Self-supervised Learning

Awiny's activity

authored a paper 7 days ago

V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in Multimodal Large Language Models

Paper • 2504.06148 • Published 8 days ago • 12

upvoted a paper 8 days ago

V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in Multimodal Large Language Models

Paper • 2504.06148 • Published 8 days ago • 12

commented on a paper 8 days ago

V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in Multimodal Large Language Models

Paper • 2504.06148 • Published 8 days ago • 12

authored a paper 21 days ago

Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models

Paper • 2503.20198 • Published 22 days ago • 4

upvoted a paper 21 days ago

Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models

Paper • 2503.20198 • Published 22 days ago • 4

commented on 2 papers 21 days ago

Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models

Paper • 2503.20198 • Published 22 days ago • 4

BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation

Paper • 2503.20672 • Published 21 days ago • 13

upvoted a paper 29 days ago

Impossible Videos

Paper • 2503.14378 • Published 29 days ago • 59

upvoted 4 papers about 1 month ago

TPDiff: Temporal Pyramid Video Diffusion Model

Paper • 2503.09566 • Published Mar 12 • 44

Automated Movie Generation via Multi-Agent CoT Planning

Paper • 2503.07314 • Published Mar 10 • 43

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Paper • 2503.03651 • Published Mar 5 • 16

Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models

Paper • 2503.01774 • Published Mar 3 • 43

upvoted a paper about 2 months ago

PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data

Paper • 2502.14397 • Published Feb 20 • 41

updated a Space about 2 months ago

README

📈

published a Space about 2 months ago

README

📈

upvoted a paper 2 months ago

WorldGUI: Dynamic Testing for Comprehensive Desktop GUI Automation

Paper • 2502.08047 • Published Feb 12 • 27

liked a dataset 2 months ago

CSU-JPG/TextAtlas5M

Viewer • Updated Feb 21 • 5.35M • 5.62k • 21

commented on a paper 2 months ago

TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation

Paper • 2502.07870 • Published Feb 11 • 44

upvoted a paper 2 months ago

TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation

Paper • 2502.07870 • Published Feb 11 • 44

liked a dataset 2 months ago

CSU-JPG/TextAtlasEval

Viewer • Updated Feb 23 • 3k • 153 • 8