Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2404.14507

Papers - Nvidia

LITA: Language Instructed Temporal-Localization Assistant

Paper • 2403.19046 • Published Mar 27 • 17
Snap-it, Tap-it, Splat-it: Tactile-Informed 3D Gaussian Splatting for Reconstructing Challenging Surfaces

Paper • 2403.20275 • Published Mar 29 • 8
Condition-Aware Neural Network for Controlled Image Generation

Paper • 2404.01143 • Published Apr 1 • 11
CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues

Paper • 2404.03820 • Published Apr 4 • 24

Papers - Video - Synthetic Data Generator

MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies

Paper • 2403.01422 • Published Mar 3 • 26
VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding

Paper • 2403.09530 • Published Mar 14 • 8
VidToMe: Video Token Merging for Zero-Shot Video Editing

Paper • 2312.10656 • Published Dec 17, 2023 • 9
TC4D: Trajectory-Conditioned Text-to-4D Generation

Paper • 2403.17920 • Published Mar 26 • 15

Video as the New Language for Real-World Decision Making

Paper • 2402.17139 • Published Feb 27 • 18
VideoCrafter1: Open Diffusion Models for High-Quality Video Generation

Paper • 2310.19512 • Published Oct 30, 2023 • 15
VideoMamba: State Space Model for Efficient Video Understanding

Paper • 2403.06977 • Published Mar 11 • 27
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

Paper • 2401.09047 • Published Jan 17 • 13

Foundation AI Papers

Curated List of Must-Reads on LLM reasoning at Temus AI team

Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models

Paper • 2310.04406 • Published Oct 6, 2023 • 8
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15 • 99
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization

Paper • 2402.09320 • Published Feb 14 • 6
Self-Discover: Large Language Models Self-Compose Reasoning Structures

Paper • 2402.03620 • Published Feb 6 • 109

Training-Free Consistent Text-to-Image Generation

Paper • 2402.03286 • Published Feb 5 • 64
ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation

Paper • 2402.04324 • Published Feb 6 • 23
λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space

Paper • 2402.05195 • Published Feb 7 • 18
FiT: Flexible Vision Transformer for Diffusion Model

Paper • 2402.12376 • Published Feb 19 • 48

PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models

Paper • 2402.08714 • Published Feb 13 • 10
Data Engineering for Scaling Language Models to 128K Context

Paper • 2402.10171 • Published Feb 15 • 21
RLVF: Learning from Verbal Feedback without Overgeneralization

Paper • 2402.10893 • Published Feb 16 • 10
Coercing LLMs to do and reveal (almost) anything

Paper • 2402.14020 • Published Feb 21 • 12

Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis

Paper • 2401.09048 • Published Jan 17 • 8
Improving fine-grained understanding in image-text pre-training

Paper • 2401.09865 • Published Jan 18 • 15
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Paper • 2401.10891 • Published Jan 19 • 58
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Paper • 2401.13627 • Published Jan 24 • 71

Optimizing diffusion models

Provides a list of papers focusing on optimizing T2I diffusion models, targeting fewer timesteps, architecture optimization, and more.

Progressive Distillation for Fast Sampling of Diffusion Models

Paper • 2202.00512 • Published Feb 1, 2022 • 1
On Distillation of Guided Diffusion Models

Paper • 2210.03142 • Published Oct 6, 2022
InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation

Paper • 2309.06380 • Published Sep 12, 2023 • 32
Consistency Models

Paper • 2303.01469 • Published Mar 2, 2023 • 6

Diffusion Models

Instruct-Imagen: Image Generation with Multi-modal Instruction

Paper • 2401.01952 • Published Jan 3 • 30
ODIN: A Single Model for 2D and 3D Perception

Paper • 2401.02416 • Published Jan 4 • 11
Bigger is not Always Better: Scaling Properties of Latent Diffusion Models

Paper • 2404.01367 • Published Apr 1 • 19
Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models

Paper • 2404.02747 • Published Apr 3 • 11

Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models

Paper • 2312.09608 • Published Dec 15, 2023 • 13
CodeFusion: A Pre-trained Diffusion Model for Code Generation

Paper • 2310.17680 • Published Oct 26, 2023 • 69
ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Real Image

Paper • 2310.17994 • Published Oct 27, 2023 • 8
Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss

Paper • 2401.02677 • Published Jan 5 • 21

Previous
1
2
3
Next

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs