Pengxiang Li

pengxiang

pixeli

AI & ML interests

Video generation, Image editing, AD

Recent Activity

upvoted a paper about 7 hours ago

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

commented on a paper 6 days ago

Frac-Connections: Fractional Extension of Hyper-Connections

upvoted a paper 6 days ago

Frac-Connections: Fractional Extension of Hyper-Connections

Organizations

None yet

pengxiang's activity

upvoted a paper about 7 hours ago

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published 12 days ago • 61

upvoted a paper 6 days ago

Frac-Connections: Fractional Extension of Hyper-Connections

Paper • 2503.14125 • Published 6 days ago • 19

upvoted a paper 18 days ago

HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization

Paper • 2503.04598 • Published 18 days ago • 18

upvoted a paper 19 days ago

Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs

Paper • 2503.01307 • Published 21 days ago • 33

upvoted 2 papers about 1 month ago

Large Language Diffusion Models

Paper • 2502.09992 • Published Feb 14 • 105

The Curse of Depth in Large Language Models

Paper • 2502.05795 • Published Feb 9 • 37

upvoted 6 papers 2 months ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8 • 265

InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection

Paper • 2501.04575 • Published Jan 8 • 23

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published Jan 8 • 91

upvoted 3 papers 3 months ago

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Paper • 2412.19723 • Published Dec 27, 2024 • 84

Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN

Paper • 2412.13795 • Published Dec 18, 2024 • 20

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 78

upvoted 4 papers 4 months ago

ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published Nov 26, 2024 • 84

DreamMix: Decoupling Object Attributes for Enhanced Editability in Customized Image Inpainting

Paper • 2411.17223 • Published Nov 26, 2024 • 7

MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control

Paper • 2411.13807 • Published Nov 21, 2024 • 11

EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation

Paper • 2411.08380 • Published Nov 13, 2024 • 26

upvoted a collection 4 months ago

🍃 MINT-1T

Collection

Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 13 items • Updated Jul 24, 2024 • 58