Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding Paper • 2405.08748 • Published 3 days ago • 13
Embedding Model Datasets Collection A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 49 items • Updated 2 days ago • 10
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 11 items • Updated about 16 hours ago • 85
ZeroGPU Spaces Collection ZeroGPU Spaces made by the community • 16 items • Updated about 16 hours ago • 132
Llama3-ChatQA-1.5 Collection Llama3-ChatQA-1.5 models excel at conversational question answering (QA) and retrieval-augmented generation (RAG). • 6 items • Updated 14 days ago • 35
Granite Code Models Collection A series of code models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 10 items • Updated 6 days ago • 117
🎭 Avatars Collection The latest AI-powered technologies usher in a new era of realistic avatars! 🚀 • 33 items • Updated 4 days ago • 49
CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images Paper • 2310.16825 • Published Oct 25, 2023 • 27
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published 19 days ago • 107
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published 15 days ago • 92
MetaAI's CodeLlama - Coding Assistant LLM Collection Fast, small, and capable coding model you can run locally on your computer! Requires 8GB+ of RAM. • 4 items • Updated Sep 8, 2023 • 5
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data Paper • 2404.15653 • Published 24 days ago • 24
A Careful Examination of Large Language Model Performance on Grade School Arithmetic Paper • 2405.00332 • Published 17 days ago • 24
git-theta Collection Playing with git-theta: https://github.com/r-three/git-theta • 2 items • Updated 18 days ago • 1
Albert Collection The up-to-date models in the Albert family; archived models do not appear in this collection. The various models behind Albert • 4 items • Updated 5 days ago • 6
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published 25 days ago • 120
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Paper • 2402.09844 • Published Feb 15 • 18
〽️MistralAI Collection A collection of MistralAI models that you can trust in production! • 7 items • Updated 8 days ago • 7
AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation Paper • 2404.12753 • Published 29 days ago • 38
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models Paper • 2404.13013 • Published 28 days ago • 26
TextSquare: Scaling up Text-Centric Visual Instruction Tuning Paper • 2404.12803 • Published 29 days ago • 27
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing Paper • 2404.12253 • Published 29 days ago • 50
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 29 days ago • 520
A little guide to building Large Language Models in 2024 Collection Resources mentioned by @thomwolf in https://x.com/Thom_Wolf/status/1773340316835131757 • 19 items • Updated Apr 1 • 13
Article Releasing Youtube-Commons: a massive open corpus for conversational and multimodal data By Pclanglais • 30 days ago • 20
Article Orchestration of Experts: The First-Principle Multi-Model System By alirezamsh • Apr 16 • 8
Article How to train a new language model from scratch using Transformers and Tokenizers Feb 14, 2020 • 8
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated 12 days ago • 76
[lecture artifacts] aligning open language models Collection artifacts referenced in the talk timeline! Slides: https://docs.google.com/presentation/d/1quMyI4BAx4rvcDfk8jjv063bmHg4RxZd9mhQloXpMn0/edit?usp=sharin • 63 items • Updated about 1 month ago • 41
Article DS-MoE: Making MoE Models More Efficient and Less Memory-Intensive By bpan • Apr 9 • 26
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models Paper • 2404.07839 • Published Apr 11 • 37
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments Paper • 2404.07972 • Published Apr 11 • 40
ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback Paper • 2404.07987 • Published Apr 11 • 45
DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting Paper • 2404.06903 • Published Apr 10 • 14
Article Text2SQL using Hugging Face Dataset Viewer API and Motherduck DuckDB-NSQL-7B Apr 4 • 20