Sarah Thompson

crimsonFalcon91

AI & ML interests

None yet

Recent Activity

liked a model about 16 hours ago

answerdotai/ModernBERT-base

Organizations

None yet

crimsonFalcon91's activity

upvoted a paper about 16 hours ago

DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation

Paper • 2412.18597 • Published 1 day ago • 10

upvoted 14 papers 13 days ago

3DGS-Enhancer: Enhancing Unbounded 3D Gaussian Splatting with View-consistent 2D Diffusion Priors

Paper • 2410.16266 • Published Oct 21 • 4

EvoPress: Towards Optimal Dynamic Model Compression via Evolutionary Search

Paper • 2410.14649 • Published Oct 18 • 8

Frontiers in Intelligent Colonoscopy

Paper • 2410.17241 • Published Oct 22 • 3

MiniPLM: Knowledge Distillation for Pre-Training Language Models

Paper • 2410.17215 • Published Oct 22 • 14

xGen-MM-Vid (BLIP-3-Video): You Only Need 32 Tokens to Represent a Video Even in VLMs

Paper • 2410.16267 • Published Oct 21 • 17

Mitigating Object Hallucination via Concentric Causal Attention

Paper • 2410.15926 • Published Oct 21 • 16

PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction

Paper • 2410.17247 • Published Oct 22 • 45

SpectroMotion: Dynamic 3D Reconstruction of Specular Scenes

Paper • 2410.17249 • Published Oct 22 • 41

LLM-based Optimization of Compound AI Systems: A Survey

Paper • 2410.16392 • Published Oct 21 • 14

JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation

Paper • 2410.17250 • Published Oct 22 • 14

Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes

Paper • 2410.16930 • Published Oct 22 • 6

Improve Vision Language Model Chain-of-thought Reasoning

Paper • 2410.16198 • Published Oct 21 • 22

Phi-4 Technical Report

Paper • 2412.08905 • Published 14 days ago • 92

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Paper • 2412.09596 • Published 13 days ago • 90