Article How to directly access 150k+ Hugging Face Datasets with DuckDB and query using GPT-4o By chilijung • 1 day ago • 5
Article ⚗️ 🔥 Building High-Quality Datasets with distilabel and Prometheus 2 By burtenshaw • 4 days ago • 20
sentence-transformers-from-synthetic-data Collection Example of using distilabel to generate synthetic triplets data for fine-tuning a Sentence Transformer model • 3 items • Updated 1 day ago • 15
Article Training and Finetuning Embedding Models with Sentence Transformers v3 5 days ago • 61
Granite Code Models: A Family of Open Foundation Models for Code Intelligence Paper • 2405.04324 • Published 26 days ago • 14
Article GPU Poor Savior: Revolutionizing Low-Bit Open Source LLMs and Cost-Effective Edge Computing By NicoNico • 8 days ago • 9
DiscoLeo 8B: Llama3 for German Collection Continued Pretraining on Llama3 8B to improve German linguistic capabilities. A collection of base and fine-tuned models and variants. • 5 items • Updated 8 days ago • 12
DiscoLeo 8B quants Collection A collection of different quantizations of the DiscoLeo models. • 3 items • Updated 8 days ago • 3
C4AI Aya 23 Collection Aya 23 is an open weights research release of an instruction fine-tuned model with highly advanced multilingual capabilities. • 3 items • Updated 10 days ago • 34
Article ⚗️ 🧑🏼🌾 Let's grow some Domain Specific Datasets together By burtenshaw • Apr 29 • 27
C4AI Command R Plus Collection C4AI Command R+ is an open weights research release of a 104 billion parameter model with highly advanced capabilities. • 3 items • Updated 10 days ago • 18
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 22 items • Updated 2 days ago • 299
CommonCatalog Collection Common Catalog, a dataset with Creative Commons licensed images and machine-generated caption pairs • 8 items • Updated 16 days ago • 7
M2-BERT Embeddings Collection Models and Datasets for M2-BERT and LoCoV1 • 10 items • Updated 11 days ago • 2
Granite Code Models Collection A series of code models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 18 items • Updated 2 days ago • 135
Article Saving Memory Using Padding-Free Transformer Layers during Finetuning By mayank-mishra • Mar 9 • 4
Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Apr 15 • 134
Article 🧑⚖️ "Replacing Judges with Juries" using distilabel By alvarobartt • 29 days ago • 14
llama 3 self-align experiments Collection Replicating the pipeline for StarCoder-2 Instruct on Llama-3-8B with some tweaks https://huggingface.co/blog/sc2-instruct • 4 items • Updated 23 days ago • 6
Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation Apr 29 • 69
Article Post-OCR-Correction: 1 billion words dataset of automated OCR correction by LLM By Pclanglais • Apr 26 • 10
Article LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!) By wolfram • Apr 24 • 48
〽️MistralAI Collection A collection of MistralAI models that you can trust in production! • 10 items • Updated 7 days ago • 7
Zeroshot Classifiers Collection These are my current best zeroshot classifiers. Some of my older models are downloaded more often, but the models in this collection are newer/better. • 11 items • Updated Apr 3 • 80
LLM Speculative Decoding Collection Tiny language models meant to serve as draft models for speculative decoding. • 6 items • Updated Jan 6 • 2
Recent models: last 100 repos, sorted by creation date Collection The last 100 repos I have created. Sorted by creation date descending, so the most recently created repos appear at the top. • 121 items • Updated Jan 31 • 448