gary109's Collections
Application
updated
NExT-GPT: Any-to-Any Multimodal LLM
Paper
•
2309.05519
•
Published
•
74
Large Language Model for Science: A Study on P vs. NP
Paper
•
2309.05689
•
Published
•
20
AstroLLaMA: Towards Specialized Foundation Models in Astronomy
Paper
•
2309.06126
•
Published
•
16
Large Language Models for Compiler Optimization
Paper
•
2309.07062
•
Published
•
22
GaussianDreamer: Fast Generation from Text to 3D Gaussian Splatting with Point Cloud Priors
Paper
•
2310.08529
•
Published
•
16
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Paper
•
2310.00704
•
Published
•
16
Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory
Paper
•
2305.17144
•
Published
•
2
Tree of Thoughts: Deliberate Problem Solving with Large Language Models
Paper
•
2305.10601
•
Published
•
7
UI Layout Generation with LLMs Guided by UI Grammar
Paper
•
2310.15455
•
Published
•
2
Controlled Decoding from Language Models
Paper
•
2310.17022
•
Published
•
12
ControlLLM: Augment Language Models with Tools by Searching on Graphs
Paper
•
2310.17796
•
Published
•
15
Multimodal ChatGPT for Medical Applications: an Experimental Study of GPT-4V
Paper
•
2310.19061
•
Published
•
8
VideoCrafter1: Open Diffusion Models for High-Quality Video Generation
Paper
•
2310.19512
•
Published
•
14
TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise
Paper
•
2310.19019
•
Published
•
9
ChipNeMo: Domain-Adapted LLMs for Chip Design
Paper
•
2311.00176
•
Published
•
7
LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing
Paper
•
2311.00571
•
Published
•
39
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
Paper
•
2311.00430
•
Published
•
53
Controllable Music Production with Diffusion Models and Guidance Gradients
Paper
•
2311.00613
•
Published
•
23
MSTRE-Net: Multistreaming Acoustic Modeling for Automatic Lyrics Transcription
Paper
•
2108.02625
•
Published
•
1
RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation
Paper
•
2311.01455
•
Published
•
25
FLAP: Fast Language-Audio Pre-training
Paper
•
2311.01615
•
Published
•
16
PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion
Paper
•
2311.01767
•
Published
•
16
Fast View Synthesis of Casual Videos
Paper
•
2312.02135
•
Published
•
8
Paper
•
2312.02149
•
Published
•
4
StemGen: A music generation model that listens
Paper
•
2312.08723
•
Published
•
45
Proactive Detection of Voice Cloning with Localized Watermarking
Paper
•
2401.17264
•
Published
•
15
PokéLLMon: A Human-Parity Agent for Pokémon Battles with Large Language Models
Paper
•
2402.01118
•
Published
•
28
K-Level Reasoning with Large Language Models
Paper
•
2402.01521
•
Published
•
16
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities
Paper
•
2402.01831
•
Published
•
11
TinyLlama: An Open-Source Small Language Model
Paper
•
2401.02385
•
Published
•
80
Magic-Me: Identity-Specific Video Customized Diffusion
Paper
•
2402.09368
•
Published
•
24
Zero-Shot Unsupervised and Text-Based Audio Editing Using DDPM Inversion
Paper
•
2402.10009
•
Published
•
18
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
Paper
•
2312.09911
•
Published
•
50
RLVF: Learning from Verbal Feedback without Overgeneralization
Paper
•
2402.10893
•
Published
•
10
Learning to Learn Faster from Human Feedback with Language Model Predictive Control
Paper
•
2402.11450
•
Published
•
20
AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls
Paper
•
2402.04253
•
Published
Personalized Audiobook Recommendations at Spotify Through Graph Neural Networks
Paper
•
2403.05185
•
Published
•
19
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
Paper
•
2403.17694
•
Published
•
10
m-a-p/ChatMusician
Text Generation
•
Updated
•
926
•
101
Audio Dialogues: Dialogues dataset for audio and music understanding
Paper
•
2404.07616
•
Published
•
14