NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models Paper • 2405.17428 • Published 5 days ago • 12
Article Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens and 11 languages 9 days ago • 12
Diffusion for World Modeling: Visual Details Matter in Atari Paper • 2405.12399 • Published 12 days ago • 25
Article From cloud to developers: Hugging Face and Microsoft Deepen Collaboration 12 days ago • 8
The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models Paper • 2404.05904 • Published Apr 8 • 3
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding Paper • 2405.08748 • Published 18 days ago • 17
Embedding Model Datasets Collection A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 51 items • Updated 7 days ago • 24
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 11 items • Updated 15 days ago • 103
ZeroGPU Spaces Collection ZeroGPU Spaces made by the community • 16 items • Updated 15 days ago • 182
Llama3-ChatQA-1.5 Collection Llama3-ChatQA-1.5 models excel at conversational question answering (QA) and retrieval-augmented generation (RAG). • 6 items • Updated 29 days ago • 37
Granite Code Models Collection A series of code models trained by IBM licensed under Apache 2.0 license. We release both the base pretrained and instruct models. • 18 items • Updated 2 days ago • 135
🎭 Avatars Collection The latest AI-powered technologies usher in a new era of realistic avatars! 🚀 • 39 items • Updated 2 days ago • 50
CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images Paper • 2310.16825 • Published Oct 25, 2023 • 28
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published Apr 29 • 115
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published about 1 month ago • 102
MetaAI's CodeLlama - Coding Assistant LLM Collection Fast, small, and capable coding model you can run locally on your computer! Requires 8GB+ of RAM. • 4 items • Updated Sep 8, 2023 • 5
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data Paper • 2404.15653 • Published Apr 24 • 24
A Careful Examination of Large Language Model Performance on Grade School Arithmetic Paper • 2405.00332 • Published May 1 • 24
git-theta Collection Playing with git-theta: https://github.com/r-three/git-theta • 2 items • Updated Apr 30 • 1
Albert Collection The up-to-date models in the Albert family; archived models do not appear in this collection. The various models behind Albert • 5 items • Updated 3 days ago • 6
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published Apr 22 • 122
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Paper • 2402.09844 • Published Feb 15 • 19
〽️MistralAI Collection A collection of MistralAI models that you can trust in production! • 10 items • Updated 7 days ago • 7
AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation Paper • 2404.12753 • Published Apr 19 • 38
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models Paper • 2404.13013 • Published Apr 19 • 26
TextSquare: Scaling up Text-Centric Visual Instruction Tuning Paper • 2404.12803 • Published Apr 19 • 27
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing Paper • 2404.12253 • Published Apr 18 • 51
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Apr 18 • 557
A little guide to building Large Language Models in 2024 Collection Resources mentioned by @thomwolf in https://x.com/Thom_Wolf/status/1773340316835131757 • 19 items • Updated Apr 1 • 14
Article Releasing Youtube-Commons: a massive open corpus for conversational and multimodal data By Pclanglais • Apr 18 • 20
Article Orchestration of Experts: The First-Principle Multi-Model System By alirezamsh • 2 days ago • 13
Article How to train a new language model from scratch using Transformers and Tokenizers Feb 14, 2020 • 9
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated 26 days ago • 83
[lecture artifacts] aligning open language models Collection artifacts referenced in the talk timeline! Slides: https://docs.google.com/presentation/d/1quMyI4BAx4rvcDfk8jjv063bmHg4RxZd9mhQloXpMn0/edit?usp=sharin • 63 items • Updated Apr 17 • 47
Article DS-MoE: Making MoE Models More Efficient and Less Memory-Intensive By bpan • Apr 9 • 26