LLaVa-Interleave Collection LLaVa models that extend the model capabilities to Multi-image, Multi-frame (videos), Multi-patch (single-image) scenarios. • 3 items • Updated 12 days ago • 11
Article Experimenting with Automatic PII Detection on the Hub using Presidio • 13 days ago • 19
Article In-browser LLM app in pure Python: Gemini Nano + Gradio-Lite By whitphx • 10 days ago • 6
AgentInstruct: Toward Generative Teaching with Agentic Flows Paper • 2407.03502 • Published 19 days ago • 34
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Paper • 2402.14905 • Published Feb 22 • 103
Scaling Synthetic Data Creation with 1,000,000,000 Personas Paper • 2406.20094 • Published 24 days ago • 84
LLM Compiler Collection Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated 25 days ago • 140
Article Going multimodal: How Prezi is leveraging the Hub and the Expert Support Program to accelerate their ML roadmap • Jun 19 • 6
Embedding Model Datasets Collection A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 67 items • Updated 19 days ago • 52
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published Jun 20 • 76
FP8 LLMs for vLLM Collection Accurate FP8 quantized models by Neural Magic, ready for use with vLLM! • 22 items • Updated about 18 hours ago • 32
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 26
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset Paper • 2303.03915 • Published Mar 7, 2023 • 6
Magpie-Pro Datasets (Llama-3) Collection Dataset built with Meta Llama 3 70B. Models are fine-tuned from Llama 3 8B. • 5 items • Updated about 12 hours ago • 14
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback Paper • 2406.00888 • Published Jun 2 • 29
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark Paper • 2406.01574 • Published Jun 3 • 42
sentence-transformers-from-synthetic-data Collection Example of using distilabel to generate synthetic triplets data for fine-tuning a Sentence Transformer model • 4 items • Updated about 1 month ago • 20
Article Synthetic dataset generation techniques: generating custom sentence similarity data By davanstrien • May 23 • 13
Article Train custom AI models with the trainer API and adapt them to 🤗 By not-lain • 23 days ago • 29
Model Merging Collection Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it! • 30 items • Updated Jun 12 • 196
Article seemore: Implement a Vision Language Model from Scratch By AviSoori1x • 29 days ago • 50
TransformerFAM: Feedback attention is working memory Paper • 2404.09173 • Published Apr 14 • 42
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences Paper • 2404.03715 • Published Apr 4 • 58
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order Paper • 2404.00399 • Published Mar 30 • 40
DIBT Prompt collective SPIN Collection This collection contains resources related to the replication of SPIN with the dibt prompt collective dataset • 8 items • Updated Mar 12 • 7
Awesome Document AI Collection A collection of open-source document AI • 27 items • Updated Mar 11 • 43
Pre-trained LMs ES Collection Monolingual language models pre-trained on Spanish and related languages. • 20 items • Updated 5 days ago • 6
Instruction-Tuned Models ES Collection Instruction-tuned models in Spanish and other related languages • 7 items • Updated 5 days ago • 4
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper • 2402.13753 • Published Feb 21 • 108
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models Paper • 2402.13064 • Published Feb 20 • 46
User-LLM: Efficient LLM Contextualization with User Embeddings Paper • 2402.13598 • Published Feb 21 • 18
Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains Paper • 2402.05140 • Published Feb 6 • 19
Instruction-tuned Language Models are Better Knowledge Learners Paper • 2402.12847 • Published Feb 20 • 24
OLMo Suite Collection Artifacts for the first set of OLMo models. • 14 items • Updated 27 days ago • 43
In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss Paper • 2402.10790 • Published Feb 16 • 40