GuoLiangTang

Tommy930

https://github.com/TommyTang930

AI & ML interests

LLM, NLP, ML

Organizations

None yet

Tommy930's activity

upvoted 9 papers about 21 hours ago

TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools

Paper • 2503.10970 • Published 4 days ago • 10

GoalFlow: Goal-Driven Flow Matching for Multimodal Trajectories Generation in End-to-End Autonomous Driving

Paper • 2503.05689 • Published 10 days ago • 2

Kolmogorov-Arnold Attention: Is Learnable Attention Better For Vision Transformers?

Paper • 2503.10632 • Published 4 days ago • 8

Cockatiel: Ensembling Synthetic and Human Preferenced Training for Detailed Video Caption

Paper • 2503.09279 • Published 6 days ago • 5

FlowTok: Flowing Seamlessly Across Text and Image Tokens

Paper • 2503.10772 • Published 4 days ago • 12

API Agents vs. GUI Agents: Divergence and Convergence

Paper • 2503.11069 • Published 4 days ago • 20

Technologies on Effectiveness and Efficiency: A Survey of State Spaces Models

Paper • 2503.11224 • Published 4 days ago • 21

ReCamMaster: Camera-Controlled Generative Rendering from A Single Video

Paper • 2503.11647 • Published 3 days ago • 81

PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity

Paper • 2503.07677 • Published 8 days ago • 66

upvoted 11 papers 1 day ago

BIMBA: Selective-Scan Compression for Long-Range Video Question Answering

Paper • 2503.09590 • Published 5 days ago • 3

MoC: Mixtures of Text Chunking Learners for Retrieval-Augmented Generation System

Paper • 2503.09600 • Published 5 days ago • 4

Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space

Paper • 2503.09419 • Published 5 days ago • 5

Cost-Optimal Grouped-Query Attention for Long-Context LLMs

Paper • 2503.09579 • Published 5 days ago • 5

When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning

Paper • 2503.07588 • Published 7 days ago • 7

More Documents, Same Length: Isolating the Challenge of Multiple Documents in RAG

Paper • 2503.04388 • Published 11 days ago • 15

Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published 5 days ago • 23

Reangle-A-Video: 4D Video Generation as Video-to-Video Translation

Paper • 2503.09151 • Published 6 days ago • 29

TPDiff: Temporal Pyramid Video Diffusion Model

Paper • 2503.09566 • Published 5 days ago • 41

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published 5 days ago • 54

PerCoV2: Improved Ultra-Low Bit-Rate Perceptual Image Compression with Implicit Hierarchical Masked Image Modeling

Paper • 2503.09368 • Published 5 days ago • 2