fdsqefsgergd

T-representer

AI & ML interests

None yet

Organizations

None yet

T-representer's activity

upvoted 3 papers about 11 hours ago

OminiControl: Minimal and Universal Control for Diffusion Transformer

Paper • 2411.15098 • Published 3 days ago • 21

BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games

Paper • 2411.13543 • Published 5 days ago • 13

VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement

Paper • 2411.15115 • Published 3 days ago • 5

upvoted 6 papers about 15 hours ago

Style-Friendly SNR Sampler for Style-Driven Generation

Paper • 2411.14793 • Published 4 days ago • 28

TÜLU 3: Pushing Frontiers in Open Language Model Post-Training

Paper • 2411.15124 • Published 3 days ago • 38

Large Multi-modal Models Can Interpret Features in Large Multi-modal Models

Paper • 2411.14982 • Published 3 days ago • 11

VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection

Paper • 2411.14794 • Published 4 days ago • 9

Efficient Long Video Tokenization via Coordinated-based Patch Reconstruction

Paper • 2411.14762 • Published 4 days ago • 9

Novel View Extrapolation with Video Diffusion Priors

Paper • 2411.14208 • Published 4 days ago • 6

upvoted 2 papers about 16 hours ago

MyTimeMachine: Personalized Facial Age Transformation

Paper • 2411.14521 • Published 4 days ago • 7

One to rule them all: natural language to bind communication, perception and action

Paper • 2411.15033 • Published 3 days ago • 2

upvoted a paper 2 days ago

Hymba: A Hybrid-head Architecture for Small Language Models

Paper • 2411.13676 • Published 5 days ago • 34

upvoted a paper 3 days ago

DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding

Paper • 2411.14347 • Published 4 days ago • 8

upvoted 7 papers 4 days ago

Multimodal Autoregressive Pre-training of Large Vision Encoders

Paper • 2411.14402 • Published 4 days ago • 36

Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions

Paper • 2411.14405 • Published 4 days ago • 46

Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization

Paper • 2411.10442 • Published 10 days ago • 57

OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs

Paper • 2411.14199 • Published 4 days ago • 23

Natural Language Reinforcement Learning

Paper • 2411.14251 • Published 4 days ago • 23

Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

Paper • 2411.14432 • Published 4 days ago • 18

Stable Flow: Vital Layers for Training-Free Image Editing

Paper • 2411.14430 • Published 4 days ago • 11