Daily Papers

by AK and the research community

Jun 6

Submitted by

akhaliq

Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

·
3 authors

Submitted by

akhaliq

A Technical Report for Polyglot-Ko: Open-Source Large-Scale Korean Language Models

·
7 authors

Submitted by

akhaliq

Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias

·
12 authors

Submitted by

akhaliq

Deductive Verification of Chain-of-Thought Reasoning

·
7 authors

Submitted by

akhaliq

HeadSculpt: Crafting 3D Head Avatars with Text

·
8 authors

Submitted by

akhaliq

PolyVoice: Language Models for Speech to Speech Translation

·
17 authors

Submitted by

akhaliq

A Static Evaluation of Code Completion by Large Language Models

·
12 authors

Submitted by

akhaliq

MotionDiffuser: Controllable Multi-Agent Motion Prediction using Diffusion

·
6 authors

Submitted by

akhaliq

LEACE: Perfect linear concept erasure in closed form

·
6 authors

Submitted by

akhaliq

Natural Language Commanding via Program Synthesis

·
5 authors

Submitted by

akhaliq

Large Language Models of Code Fail at Completing Code with Potential Bugs

·
7 authors

Submitted by

akhaliq

Neuralangelo: High-Fidelity Neural Surface Reconstruction

·
7 authors

Submitted by

akhaliq

GPT Models Meet Robotic Applications: Co-Speech Gesturing Chat System

·
5 authors

Submitted by

akhaliq

Binary and Ternary Natural Language Generation

·
5 authors

Submitted by

akhaliq

The Surprising Effectiveness of Diffusion Models for Optical Flow and Monocular Depth Estimation

·
7 authors

Submitted by

akhaliq

SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model

·
7 authors

Submitted by

akhaliq

PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model

·
6 authors

Submitted by

akhaliq

VisualGPTScore: Visio-Linguistic Reasoning with Multimodal Generative Pre-Training Scores

·
5 authors

Submitted by

akhaliq

Probabilistic Adaptation of Text-to-Video Models

·
6 authors

Submitted by

akhaliq

Transformer-based Vulnerability Detection in Code at EditTime: Zero-shot, Few-shot, or Fine-tuning?

·
8 authors