Daily Papers

by AK and the research community

Apr 18

Submitted by

shizhediao

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

·
15 authors

2

Submitted by

schwarzschild

Antidistillation Sampling

·
7 authors

4

Submitted by

davidchan

Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling

·
6 authors

Submitted by

BestWishYsh

Packing Input Frame Context in Next-Frame Prediction Models for Video Generation

·
2 authors

Submitted by

zeqixiao

WORLDMEM: Long-term Consistent World Simulation with Memory

·
7 authors

Submitted by

QizhiPei

A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis

·
8 authors

Submitted by

nielsr

Perception Encoder: The best visual embeddings are not at the output of the network

·
18 authors

Submitted by

ahmed-masry

ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering

·
14 authors

2

Submitted by

Harold328

VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models

·
8 authors

4

Submitted by

dreamerdeo

NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation

·
8 authors

2

Submitted by

sthuihui

DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging

·
7 authors

3

Submitted by

janghyuncho7

PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding

·
29 authors

Submitted by

wanghaofan

InstantCharacter: Personalize Any Characters with a Scalable Diffusion Transformer Framework

·
12 authors

2

Submitted by

zhoutianyi

Exploring Expert Failures Improves LLM Agent Tuning

·
6 authors

4

Submitted by

LeanQuant

70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float

·
6 authors

2

Submitted by

TechxGenus

Sleep-time Compute: Beyond Inference Scaling at Test-time

·
7 authors

Submitted by

dongyong2

CCMNet: Leveraging Calibrated Color Correction Matrices for Cross-Camera Color Constancy

·
5 authors

2

Submitted by

yiren98

FocusedAD: Character-centric Movie Audio Description

·
6 authors

Submitted by

HanNight

Retrieval-Augmented Generation with Conflicting Evidence

·
4 authors

2

Submitted by

cihangxie

Complex-Edit: CoT-Like Instruction Generation for Complexity-Controllable Image Editing Benchmark

·
6 authors

Submitted by

Shilin-LU

Set You Straight: Auto-Steering Denoising Trajectories to Sidestep Unwanted Concepts

·
4 authors

2

Submitted by

hriaz

MetaSynth: Meta-Prompting-Driven Agentic Scaffolds for Diverse Synthetic Data Generation

·
5 authors

2

Submitted by

WY123L

Learning Occlusion-Robust Vision Transformers for Real-Time UAV Tracking

·
7 authors

2