knightnemo

AI & ML interests

None yet

Organizations

None yet

knightnemo's activity

upvoted a paper about 15 hours ago

CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning

Paper • 2504.13820 • Published 6 days ago • 15

upvoted a paper 5 days ago

WORLDMEM: Long-term Consistent World Simulation with Memory

Paper • 2504.12369 • Published 8 days ago • 30

upvoted 3 papers 22 days ago

Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing

Paper • 2503.19385 • Published about 1 month ago • 33

Video-T1: Test-Time Scaling for Video Generation

Paper • 2503.18942 • Published about 1 month ago • 88

PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos

Paper • 2503.17973 • Published Mar 23 • 7

upvoted 7 papers about 1 month ago

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published Mar 12 • 71

World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning

Paper • 2503.10480 • Published Mar 13 • 52

upvoted 3 papers about 2 months ago

Unified Video Action Model

Paper • 2503.00200 • Published Feb 28 • 14

Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think

Paper • 2502.20172 • Published Feb 27 • 28

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published Mar 3 • 78

upvoted a collection about 2 months ago

Qwen2.5-VL

Collection • Vision-language model series based on Qwen2.5 • 11 items • Updated 24 days ago • 448

upvoted 2 papers about 2 months ago

RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers

Paper • 2502.15894 • Published Feb 21 • 20

Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation

Paper • 2502.16707 • Published Feb 23 • 13

liked a Space 2 months ago

DynamiCrafter

Space • 🐨 • Generate videos from images and text prompts • 282

upvoted a paper 2 months ago

SURGE: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors

Paper • 2502.11167 • Published Feb 16 • 10