Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 16 days ago • 243
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated 23 days ago • 242
Open Whisper-style Speech Models (OWSM) Collection Fully open Whisper-style speech foundation models developed by CMU WAVLab: https://www.wavlab.org/activities/2024/owsm/ • 15 items • Updated 14 days ago • 3
Qwen2-VL Collection Vision-language model series based on Qwen2 • 15 items • Updated 24 days ago • 133
YOLOv10 Collection This collection hosts the YOLOv10 model releases • 16 items • Updated Jun 3 • 16
Llama-3.1 Storm Models Collection Fine-tuned Llama 3.1 8B model with superior reasoning, conversation abilities, and function calling! • 3 items • Updated Aug 25 • 15
Qwen2-Audio Collection Audio-language model series based on Qwen2 • 4 items • Updated 24 days ago • 41
Qwen2-Math Collection Math-specific model series based on Qwen2 • 8 items • Updated 24 days ago • 45
Gemma 2 2B Release Collection The 2.6B parameter version of Gemma 2. • 6 items • Updated Jul 31 • 76
Llama 3.1 GPTQ, AWQ, and BNB Quants Collection Optimised Quants for high-throughput deployments! Compatible with Transformers, TGI & VLLM 🤗 • 9 items • Updated 16 days ago • 52
NuminaMath Collection Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 6 items • Updated Jul 21 • 58
LLaVa-Interleave Collection LLaVa models that extend the model capabilities to multi-image, multi-frame (video), and multi-patch (single-image) scenarios. • 3 items • Updated Jul 10 • 14
InternVL 2.0 Collection Expanding Performance Boundaries of Open-Source MLLM • 16 items • Updated Aug 10 • 73
SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 174
BigVGAN Collection BigVGAN is a universal neural vocoder that generates audio waveform using mel spectrogram as input. • 11 items • Updated 11 days ago • 9
Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Jun 24 • 173
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated 11 days ago • 156
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models in 5 sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated 24 days ago • 341
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 27 items • Updated 23 days ago • 474
Article Enjoy the Power of Phi-3 with ONNX Runtime on your device By Emma-N • May 22 • 25
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated Jul 31 • 137
Article Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints May 1 • 65
LLaVA++ (LLaMA-3 and Phi-3-Mini) Collection Extending Visual Capabilities of LLaVA with LLaMA-3 and Phi-3 • 11 items • Updated Jun 11 • 23
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22 • 251
Parler-TTS: fully open-source high-quality TTS Collection If you want to find out more about how these models were trained and even fine-tune them yourself, check out the Parler-TTS repository on GitHub. • 7 items • Updated Aug 8 • 44
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models Paper • 2403.13372 • Published Mar 20 • 58
Luganda Whisper ASR Collection Luganda Speech To Text/ Automatic Speech Recognition • 4 items • Updated Mar 17 • 1
Awesome Document AI Collection A collection of open-source document AI • 27 items • Updated Mar 11 • 72
Transformers compatible Mamba Collection This release includes the `mamba` repositories compatible with the `transformers` library • 5 items • Updated Mar 6 • 35
Quyen Collection State-of-the-art general LLMs based on Qwen1.5 • 26 items • Updated Feb 13 • 12
🎵 The MusicBox Collection A collection full of musical tasks demos, for musicians & music enthusiasts • 28 items • Updated 9 days ago • 19
Qwen1.5 Collection Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated 24 days ago • 206
StemGen: A music generation model that listens Paper • 2312.08723 • Published Dec 14, 2023 • 47
Seamless Communication Collection A significant step towards removing language barriers through expressive, fast and high-quality AI translation. • 16 items • Updated Jan 16 • 146