Collections

Discover the best community collections!

Collections including paper arxiv:2401.02957

Vision Transformer

Denoising Vision Transformers

Paper • 2401.02957 • Published Jan 5 • 26

GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation

Paper • 2312.04557 • Published Dec 7, 2023 • 12
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models

Paper • 2312.04410 • Published Dec 7, 2023 • 14
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding

Paper • 2312.04461 • Published Dec 7, 2023 • 49
Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively

Paper • 2401.02955 • Published Jan 5 • 16

Understanding LLMs: A Comprehensive Overview from Training to Inference

Paper • 2401.02038 • Published Jan 4 • 59
Denoising Vision Transformers

Paper • 2401.02957 • Published Jan 5 • 26
H2O-Danube-1.8B Technical Report

Paper • 2401.16818 • Published Jan 30 • 16

Saved trending papers

Learning Vision from Models Rivals Learning Vision from Data

Paper • 2312.17742 • Published Dec 28, 2023 • 12
Unsupervised Universal Image Segmentation

Paper • 2312.17243 • Published Dec 28, 2023 • 18
Perspectives on the State and Future of Deep Learning -- 2023

Paper • 2312.09323 • Published Dec 7, 2023 • 5
Vision-Language Models as a Source of Rewards

Paper • 2312.09187 • Published Dec 14, 2023 • 10

Computer vision

Unsupervised Universal Image Segmentation

Paper • 2312.17243 • Published Dec 28, 2023 • 18
Denoising Vision Transformers

Paper • 2401.02957 • Published Jan 5 • 26
timm/ViT-B-16-SigLIP

Zero-Shot Image Classification • Updated Oct 25, 2023 • 29.4k • 26
Slimsam

Running on Zero • 16
Small yet powerful mask generation application ⚡️

Transformers & MoE

SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention

Paper • 2312.07987 • Published Dec 13, 2023 • 39
Interfacing Foundation Models' Embeddings

Paper • 2312.07532 • Published Dec 12, 2023 • 10
Point Transformer V3: Simpler, Faster, Stronger

Paper • 2312.10035 • Published Dec 15, 2023 • 17
TheBloke/quantum-v0.01-GPTQ

Text Generation • Updated Dec 18, 2023 • 2 • 2

Language Models

Exponentially Faster Language Modelling

Paper • 2311.10770 • Published Nov 15, 2023 • 117
stabilityai/stable-video-diffusion-img2vid-xt

Image-to-Video • Updated Apr 29 • 127k • 2.27k
LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes

Paper • 2311.13384 • Published Nov 22, 2023 • 48
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis

Paper • 2311.12454 • Published Nov 21, 2023 • 27

interesting paper

Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization

Paper • 2311.06243 • Published Nov 10, 2023 • 17
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores

Paper • 2311.05908 • Published Nov 10, 2023 • 11
PolyMaX: General Dense Prediction with Mask Transformer

Paper • 2311.05770 • Published Nov 9, 2023 • 6
SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models

Paper • 2311.07575 • Published Nov 13, 2023 • 10

any size diffusion

Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images

Paper • 2308.16582 • Published Aug 31, 2023 • 10
DreamSpace: Dreaming Your Room Space with Text-Driven Panoramic Texture Propagation

Paper • 2310.13119 • Published Oct 19, 2023 • 10
DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior

Paper • 2310.16818 • Published Oct 25, 2023 • 27
Text-to-3D with classifier score distillation

Paper • 2310.19415 • Published Oct 30, 2023 • 4
