git-theta Collection Playing with git-theta: https://github.com/r-three/git-theta • 2 items • Updated about 13 hours ago • 1
Albert Collection The up-to-date models in the Albert family; archived models do not appear in this collection. The various models behind Albert • 2 items • Updated about 16 hours ago • 5
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published 8 days ago • 102
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Paper • 2402.09844 • Published Feb 15 • 16
〽️MistralAI Collection A collection of MistralAI models that you can trust in production! • 7 items • Updated 4 days ago • 7
AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation Paper • 2404.12753 • Published 12 days ago • 36
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models Paper • 2404.13013 • Published 11 days ago • 26
TextSquare: Scaling up Text-Centric Visual Instruction Tuning Paper • 2404.12803 • Published 12 days ago • 26
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing Paper • 2404.12253 • Published 12 days ago • 45
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 12 days ago • 443
A little guide to building Large Language Models in 2024 Collection Resources mentioned by @thomwolf in https://x.com/Thom_Wolf/status/1773340316835131757 • 19 items • Updated 29 days ago • 13
Article Releasing Youtube-Commons: a massive open corpus for conversational and multimodal data By Pclanglais • 12 days ago • 19
Article Orchestration of Experts: The First-Principle Multi-Model System By alirezamsh • 15 days ago • 8
Article How to train a new language model from scratch using Transformers and Tokenizers Feb 14, 2020 • 5
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 8 items • Updated 13 days ago • 57
[lecture artifacts] aligning open language models Collection artifacts referenced in the talk timeline! Slides: https://docs.google.com/presentation/d/1quMyI4BAx4rvcDfk8jjv063bmHg4RxZd9mhQloXpMn0/edit?usp=sharin • 63 items • Updated 13 days ago • 39
Article DS-MoE: Making MoE Models More Efficient and Less Memory-Intensive By bpan • 22 days ago • 25
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models Paper • 2404.07839 • Published 19 days ago • 36
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments Paper • 2404.07972 • Published 19 days ago • 39
ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback Paper • 2404.07987 • Published 19 days ago • 45
DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting Paper • 2404.06903 • Published 21 days ago • 14
Article Text2SQL using Hugging Face Dataset Viewer API and Motherduck DuckDB-NSQL-7B 27 days ago • 18
Article Making thousands of open LLMs bloom in the Vertex AI Model Garden 21 days ago • 14
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models Paper • 2402.19427 • Published Feb 29 • 47
Advancing LLM Reasoning Generalists with Preference Trees Paper • 2404.02078 • Published 28 days ago • 41
HF-curated models available on Workers AI Collection A collection of models curated with Hugging Face that can be run on Cloudflare's Workers AI serverless inference platform. • 15 items • Updated 29 days ago • 44
Getting it Right: Improving Spatial Consistency in Text-to-Image Models Paper • 2404.01197 • Published 29 days ago • 29
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order Paper • 2404.00399 • Published Mar 30 • 39
DBRX Collection DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items • Updated Mar 27 • 85
Latent Consistency Models LoRAs Collection Latent Consistency Models for Stable Diffusion - LoRAs and full fine-tuned weights • 4 items • Updated Nov 10, 2023 • 94
SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series Paper • 2403.15360 • Published Mar 22 • 11
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference Paper • 2403.14520 • Published Mar 21 • 31
EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba Paper • 2403.09977 • Published Mar 15 • 7