LlamaForTokenClassification Collection Fine-tuned Llama variants for token classification • 6 items • Updated 6 days ago • 2
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 11 items • Updated 2 days ago • 88
T5 release Collection The original T5 transformer release was done in two steps: the original T5 checkpoints and the improved T5v1 • 9 items • Updated 5 days ago • 10
Flan-T5 release Collection The Flan-T5 covers 4 checkpoints of different sizes each time. It also includes upgraded versions trained using Universal sampling • 7 items • Updated 5 days ago • 14
BERT release Collection Regroups the original BERT models released by the Google team. Except for the models marked otherwise, the checkpoints support English. • 8 items • Updated 5 days ago • 15
NB-Whisper-verbatim Collection NB-Whisper models that are mostly suited for linguists and researchers. The output is lowercase and without punctuation. • 5 items • Updated Feb 13 • 1
Article Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints 18 days ago • 50
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published 20 days ago • 107
Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published 19 days ago • 61
Article ⚗️ 🧑🏼🌾 Let's grow some Domain Specific Datasets together By burtenshaw • 20 days ago • 25
Article 🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets By dvilasuero • 23 days ago • 54
Article Post-OCR-Correction: 1 billion words dataset of automated OCR correction by LLM By Pclanglais • 23 days ago • 10
PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training Paper • 2309.10400 • Published Sep 19, 2023 • 22
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published 26 days ago • 120
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published 27 days ago • 230
Llama 2 Family Collection This collection hosts the transformers and original repos of the Llama 2 and Llama Guard releases • 13 items • Updated about 1 month ago • 28
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated about 1 month ago • 523
Article Introducing the LiveCodeBench Leaderboard - Holistic and Contamination-Free Evaluation of Code LLMs Apr 16 • 11
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated 13 days ago • 76
distil-large-v3 Collection This collection contains the model repositories for distil-large-v3, which provides support for the most popular Whisper libraries. • 4 items • Updated Mar 21 • 4
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Paper • 2404.07143 • Published Apr 10 • 92
C4AI Command R Collection C4AI Command-R is a research release of a 35 billion parameter highly performant generative model. Command-R is a large language model with open weigh • 3 items • Updated Mar 28 • 9
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 55
Zephyr ORPO Collection Models and datasets to align LLMs with Odds Ratio Preference Optimisation (ORPO). Recipes here: https://github.com/huggingface/alignment-handbook • 3 items • Updated Apr 12 • 14
DBRX Collection DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items • Updated Mar 27 • 89
Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation Paper • 2403.12015 • Published Mar 18 • 60
State-of-the-art Danish Models Collection These models are state-of-the-art for Danish within their respective domains (highlighted below the model). • 13 items • Updated Apr 11 • 9
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context Paper • 2403.05530 • Published Mar 8 • 50
Personalized Audiobook Recommendations at Spotify Through Graph Neural Networks Paper • 2403.05185 • Published Mar 8 • 19
Llama2 HQQ Quantized Models Collection Llama 2 models quantized using https://github.com/mobiusml/hqq • 6 items • Updated Mar 29 • 5
Mixtral HQQ Quantized Models Collection 4-bit and 2-bit Mixtral models quantized using https://github.com/mobiusml/hqq • 9 items • Updated Mar 29 • 14
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6 • 172
Unifying Vision, Text, and Layout for Universal Document Processing Paper • 2212.02623 • Published Dec 5, 2022 • 10
Zephyr 7B Gemma Collection Models, dataset, and demo for Zephyr 7B Gemma. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 5 items • Updated Apr 12 • 15
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 566
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models Paper • 2402.13064 • Published Feb 20 • 45
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper • 2402.13753 • Published Feb 21 • 104
YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information Paper • 2402.13616 • Published Feb 21 • 44
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated 5 days ago • 303
INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning Paper • 2401.06532 • Published Jan 12 • 10
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model Paper • 2402.07827 • Published Feb 12 • 43
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning Paper • 2402.06619 • Published Feb 9 • 47
The Truth is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction Paper • 2312.13558 • Published Dec 21, 2023 • 5
Text to Speech 🗣️ Collection A collection of TTS models supported in 🤗 Transformers. • 4 items • Updated Sep 16, 2023 • 5
Automatic Speech Recognition 📝 Collection A collection of ASR models supported in 🤗 Transformers • 11 items • Updated Sep 16, 2023 • 5
Audio Classification 🔊 Collection A collection of audio classification models supported in 🤗 Transformers • 3 items • Updated Sep 16, 2023 • 3