Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models Paper • 2404.18796 • Published Apr 29 • 67
TabuLa-8B Collection Training, eval suite, and model from the paper "Large Scale Transfer Learning for Tabular Data via Language Modeling" https://arxiv.org/abs/2406.12031 • 4 items • Updated 2 days ago • 7
Depth Anything v2 Release Collection A comprehensive collection on DAv2 • 5 items • Updated 3 days ago • 8
FP8 LLMs for vLLM Collection Accurate FP8 quantized models by Neural Magic, ready for use with vLLM! • 10 items • Updated 1 day ago • 15
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing Paper • 2406.08464 • Published 8 days ago • 42
codestral-text2cypher Collection codestral finetuned for text2cypher • 3 items • Updated 11 days ago • 2
Local Function Calling Gems Collection These are the best function calling LLMs one can run on less than 64GB VRAM/Unified Memory. I use these on an M1 Max MacBook with 64GB. • 5 items • Updated 15 days ago • 3
Qwen2 Collection Qwen2 language models, with pretrained and instruction-tuned models in 5 sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 29 items • Updated 15 days ago • 195
DeTikZify Collection Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ • 9 items • Updated 18 days ago • 2
Article Releasing Common Corpus: the largest public domain dataset for training LLMs By Pclanglais • Mar 20 • 12
Article How to directly access 150k+ Hugging Face Datasets with DuckDB and query using GPT-4o By chilijung • 21 days ago • 10
Article ⚗️ 🔥 Building High-Quality Datasets with distilabel and Prometheus 2 By burtenshaw • 17 days ago • 20
sentence-transformers-from-synthetic-data Collection Example of using distilabel to generate synthetic triplets data for fine-tuning a Sentence Transformer model • 4 items • Updated 10 days ago • 18
Article Training and Finetuning Embedding Models with Sentence Transformers v3 • 24 days ago • 100
Granite Code Models: A Family of Open Foundation Models for Code Intelligence Paper • 2405.04324 • Published May 7 • 14
Article GPU Poor Savior: Revolutionizing Low-Bit Open Source LLMs and Cost-Effective Edge Computing By NicoNico • 27 days ago • 9
DiscoLeo 8B: Llama3 for German Collection Continued Pretraining on Llama3 8B to improve German linguistic capabilities. A collection of base and fine-tuned models and variants. • 5 items • Updated 27 days ago • 13
DiscoLeo 8B quants Collection A collection of different quantizations of the DiscoLeo models. • 3 items • Updated 27 days ago • 3
C4AI Aya 23 Collection Aya 23 is an open weights research release of an instruction fine-tuned model with highly advanced multilingual capabilities. • 3 items • Updated 29 days ago • 34
Article ⚗️ 🧑🏼🌾 Let's grow some Domain Specific Datasets together By burtenshaw • Apr 29 • 27
C4AI Command R Plus Collection C4AI Command R+ is an open weights research release of a 104B parameter model with highly advanced capabilities. • 3 items • Updated 29 days ago • 20
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 22 items • Updated 21 days ago • 341
CommonCatalog Collection Common Catalog, a dataset with Creative Commons licensed images and machine-generated caption pairs • 8 items • Updated May 16 • 11
M2-BERT Embeddings Collection Models and Datasets for M2-BERT and LoCoV1 • 10 items • Updated about 1 month ago • 2
Granite Code Models Collection A series of code models trained by IBM and licensed under the Apache 2.0 license. We release both the base pretrained and instruct models. • 18 items • Updated 21 days ago • 145
Article Saving Memory Using Padding-Free Transformer Layers during Finetuning By mayank-mishra • 10 days ago • 6
Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community • Apr 15 • 142
Article 🧑⚖️ "Replacing Judges with Juries" using distilabel By alvarobartt • May 3 • 14
llama 3 self-align experiments Collection Replicating the pipeline for StarCoder-2 Instruct on Llama-3-8B with some tweaks https://huggingface.co/blog/sc2-instruct • 4 items • Updated May 9 • 6
Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation • Apr 29 • 69
Article Post-OCR-Correction: 1 billion words dataset of automated OCR correction by LLM By Pclanglais • Apr 26 • 12
Article LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!) By wolfram • Apr 24 • 51
〽️MistralAI Collection A collection of MistralAI models that you can trust in production! • 10 items • Updated 18 days ago • 7
Zeroshot Classifiers Collection These are my current best zeroshot classifiers. Some of my older models are downloaded more often, but the models in this collection are newer/better. • 11 items • Updated Apr 3 • 82
LLM Speculative Decoding Collection Tiny language models meant to serve as draft models for speculative decoding. • 6 items • Updated Jan 6 • 2
Recent models: last 100 repos, sorted by creation date Collection The last 100 repos I have created. Sorted by creation date descending, so the most recently created repos appear at the top. • 121 items • Updated Jan 31 • 455