Daily Papers

by AK and the research community

Jul 11

Submitted by akhaliq
PaliGemma: A versatile 3B VLM for transfer · 35 authors

Submitted by akhaliq
Inference Performance Optimization for Large Language Models on CPUs · 10 authors

Submitted by akhaliq
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models · 8 authors

Submitted by akhaliq
Controlling Space and Time with Diffusion Models · 5 authors

Submitted by akhaliq
Video-to-Audio Generation with Hidden Alignment · 7 authors

Submitted by akhaliq
VEnhancer: Generative Space-Time Enhancement for Video Generation · 9 authors

Submitted by akhaliq
Still-Moving: Customized Video Generation without Customized Video Data · 10 authors

Submitted by akhaliq
Do Vision and Language Models Share Concepts? A Vector Space Alignment Study · 4 authors

Submitted by davanstrien
CosmoCLIP: Generalizing Large Vision-Language Models for Astronomical Imaging · 5 authors

Submitted by davanstrien
On Leakage of Code Generation Evaluation Datasets · 10 authors

Submitted by HikariDawn
This&That: Language-Gesture Controlled Video Generation for Robot Planning · 7 authors

Submitted by PAlbert31
An accurate detection is not all you need to combat label noise in web-noisy datasets · 6 authors

Submitted by gxyes
CrowdMoGen: Zero-Shot Text-Driven Collective Motion Generation · 5 authors

Submitted by akhaliq
BiGym: A Demo-Driven Mobile Bi-Manual Manipulation Benchmark · 6 authors