Daily Papers

by AK and the research community

May 19

Submitted by

akhaliq

Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold

·
6 authors

Submitted by

akhaliq

Tree of Thoughts: Deliberate Problem Solving with Large Language Models

·
7 authors

Submitted by

akhaliq

LDM3D: Latent Diffusion Model for 3D

·
11 authors

Submitted by

akhaliq

OpenShape: Scaling Up 3D Shape Representation Towards Open-World Understanding

·
9 authors

Submitted by

akhaliq

SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities

·
7 authors

Submitted by

akhaliq

CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-training

·
8 authors

Submitted by

akhaliq

VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks

·
11 authors

Submitted by

akhaliq

UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild

·
13 authors

Submitted by

akhaliq

TextDiffuser: Diffusion Models as Text Painters

·
6 authors

Submitted by

akhaliq

Discriminative Diffusion Models as Few-shot Vision and Language Learners

·
9 authors

Submitted by

akhaliq

Going Denser with Open-Vocabulary Part Segmentation

·
7 authors

Submitted by

akhaliq

mLongT5: A Multilingual and Efficient Text-To-Text Transformer for Longer Sequences

·
4 authors

Submitted by

akhaliq

TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models

·
5 authors

Submitted by

akhaliq

Learning the Visualness of Text Using Large Vision-Language Models

·
5 authors

Submitted by

akhaliq

GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework

·
7 authors

Submitted by

akhaliq

A Generalist Dynamics Model for Control

·
10 authors

Submitted by

akhaliq

MolXPT: Wrapping Molecules with Text for Generative Pre-training

·
8 authors

Submitted by

akhaliq

VideoFactory: Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation

·
7 authors

Submitted by

akhaliq

Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models

·
10 authors