Joya Chen PRO

chenjoya

https://chenjoya.github.io/

AI & ML interests

Streaming Video LLM

Recent Activity

updated a model about 4 hours ago

chenjoya/LiveCC-7B-Base

updated a model about 4 hours ago

chenjoya/LiveCC-7B-Instruct

updated a dataset about 5 hours ago

chenjoya/Live-WhisperX-526K

chenjoya's activity

upvoted 2 papers 1 day ago

Describe Anything: Detailed Localized Image and Video Captioning

Paper • 2504.16072 • Published 2 days ago • 46

LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale

Paper • 2504.16030 • Published 2 days ago • 22

upvoted a collection 1 day ago

LiveCC

Learning Video LLM with Streaming Speech Transcription at Scale (CVPR 2025) • 8 items • Updated 1 day ago • 3

upvoted a paper 30 days ago

Long-Context Autoregressive Video Modeling with Next-Frame Prediction

Paper • 2503.19325 • Published about 1 month ago • 72

upvoted 4 papers about 1 month ago

Impossible Videos

Paper • 2503.14378 • Published Mar 18 • 61

VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning

Paper • 2503.13444 • Published Mar 17 • 15

TPDiff: Temporal Pyramid Video Diffusion Model

Paper • 2503.09566 • Published Mar 12 • 45

Automated Movie Generation via Multi-Agent CoT Planning

Paper • 2503.07314 • Published Mar 10 • 45

upvoted 3 papers about 2 months ago

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Paper • 2503.03651 • Published Mar 5 • 16

Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models

Paper • 2503.01774 • Published Mar 3 • 44

PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data

Paper • 2502.14397 • Published Feb 20 • 42

upvoted 2 papers 2 months ago

WorldGUI: Dynamic Testing for Comprehensive Desktop GUI Automation

Paper • 2502.08047 • Published Feb 12 • 27

TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation

Paper • 2502.07870 • Published Feb 11 • 44

upvoted a paper 3 months ago

MakeAnything: Harnessing Diffusion Transformers for Multi-Domain Procedural Sequence Generation

Paper • 2502.01572 • Published Feb 3 • 20

upvoted 6 papers 4 months ago

Are Your LLMs Capable of Stable Reasoning?

Paper • 2412.13147 • Published Dec 17, 2024 • 95

No More Adam: Learning Rate Scaling at Initialization is All You Need

Paper • 2412.11768 • Published Dec 16, 2024 • 44

Progressive Multimodal Reasoning via Active Retrieval

Paper • 2412.14835 • Published Dec 19, 2024 • 74

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 365

Parallelized Autoregressive Visual Generation

Paper • 2412.15119 • Published Dec 19, 2024 • 54

Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 39