Evaluating Language Models as Synthetic Data Generators Paper • 2412.03679 • Published 10 days ago • 39
BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices Paper • 2411.10640 • Published 29 days ago • 44
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated 12 days ago • 191
Zeroshot Classifiers Collection These are my current best zeroshot classifiers. Some of my older models are downloaded more often, but the models in this collection are newer/better. • 11 items • Updated Apr 3 • 113
OLMoE Collection Artifacts for open mixture-of-experts language models. • 13 items • Updated 17 days ago • 27
Article A failed experiment: Infini-Attention, and why we should keep trying? Aug 14 • 51
Gemma 2: Improving Open Language Models at a Practical Size Paper • 2408.00118 • Published Jul 31 • 75
Bad Data Toolbox Collection PleIAs collection of models for the data processing of challenging document and data sources. • 5 items • Updated Jul 18 • 15
Article Advanced RAG: Fine-Tune Embeddings from HuggingFace for RAG By lucifertrj • Jul 5 • 4
GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks Paper • 2406.12925 • Published Jun 14 • 23
A little guide to building Large Language Models in 2024 Collection Resources mentioned by @thomwolf in https://x.com/Thom_Wolf/status/1773340316835131757 • 19 items • Updated Apr 1 • 14