Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2311.00430

Knowledge Distillation

shayekh/aya8b-distillkit-hidden

Updated 20 days ago • 1
shayekh/aya8b-distillkit-logits

Updated 20 days ago
AhmadMustafa/distAyaQwen

Updated 21 days ago • 2 • 1
Less is More: Task-aware Layer-wise Distillation for Language Model Compression

Paper • 2210.01351 • Published Oct 4, 2022 • 1

Speech to text.

Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling

Paper • 2311.00430 • Published Nov 1, 2023 • 56

whisper_related_papers

Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling

Paper • 2311.00430 • Published Nov 1, 2023 • 56
Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data

Paper • 2309.13876 • Published Sep 25, 2023 • 1
Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition

Paper • 2310.06434 • Published Oct 10, 2023 • 4

Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling

Paper • 2311.00430 • Published Nov 1, 2023 • 56

Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling

Paper • 2311.00430 • Published Nov 1, 2023 • 56

Masked Autoencoders Are Scalable Vision Learners

Paper • 2111.06377 • Published Nov 11, 2021 • 2
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling

Paper • 2311.00430 • Published Nov 1, 2023 • 56
distil-whisper/distil-large-v2

Automatic Speech Recognition • Updated Mar 21 • 63.9k • 498
Seven Failure Points When Engineering a Retrieval Augmented Generation System

Paper • 2401.05856 • Published Jan 11 • 2

Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Paper • 2312.03818 • Published Dec 6, 2023 • 31
Scaling Laws of Synthetic Images for Model Training ... for Now

Paper • 2312.04567 • Published Dec 7, 2023 • 7
Large Language Models for Mathematicians

Paper • 2312.04556 • Published Dec 7, 2023 • 11
LooseControl: Lifting ControlNet for Generalized Depth Conditioning

Paper • 2312.03079 • Published Dec 5, 2023 • 12

Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling

Paper • 2311.00430 • Published Nov 1, 2023 • 56
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis

Paper • 2307.01952 • Published Jul 4, 2023 • 79
Language Modeling Is Compression

Paper • 2309.10668 • Published Sep 19, 2023 • 82
Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models

Paper • 2311.00871 • Published Nov 1, 2023 • 2

This is a collection of some papers I've read in the past few months

FinGPT: Large Generative Models for a Small Language

Paper • 2311.05640 • Published Nov 3, 2023 • 27
LCM-LoRA: A Universal Stable-Diffusion Acceleration Module

Paper • 2311.05556 • Published Nov 9, 2023 • 79
Distributed Deep Learning in Open Collaborations

Paper • 2106.10207 • Published Jun 18, 2021 • 1
Datasets: A Community Library for Natural Language Processing

Paper • 2109.02846 • Published Sep 7, 2021 • 8

Adapting/Distilling

Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling

Paper • 2311.00430 • Published Nov 1, 2023 • 56

Previous
1
2
3
Next

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs