Athene v2 Chat & Agent by NexusFlow - SoTA general LLM fine-tuned from Qwen 2.5 72B excels at Chat + Function Calling/ JSON/ Agents Nexusflow/athene-v2-6735b85e505981a794fb02cc
Orca Agent Instruct by Microsoft - 1 million instruct pairs covering text editing, creative writing, coding, reading comprehension, etc - permissively licensed microsoft/orca-agentinstruct-1M-v1
It's been a while since we shipped native quantization support in diffusers 🧨
We currently support bitsandbytes as the official backend, but using others like torchao is already very simple.
This post is just a reminder of what's possible:
1. Loading a model with a quantization config
2. Saving a model with a quantization config
3. Loading a pre-quantized model
4. enable_model_cpu_offload()
5. Training and loading LoRAs into quantized checkpoints
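A minimal sketch of items 1 to 4, assuming the bitsandbytes backend and a Flux-style checkpoint. The choice of `FluxTransformer2DModel` / `FluxPipeline`, the NF4 settings, and the helper names are assumptions for illustration, not something the post specifies; LoRA training (item 5) is not covered here.

```python
def make_nf4_config():
    """Build a 4-bit NF4 config (illustrative settings, not from the post)."""
    # Heavy imports stay local so this file can be imported without a GPU stack.
    import torch
    from diffusers import BitsAndBytesConfig

    return BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )


def load_and_save_quantized(model_id, save_dir):
    """Items 1 and 2: load with a quantization config, then save the result."""
    import torch
    from diffusers import FluxTransformer2DModel

    transformer = FluxTransformer2DModel.from_pretrained(
        model_id,
        subfolder="transformer",
        quantization_config=make_nf4_config(),
        torch_dtype=torch.bfloat16,
    )
    # The quantization config is stored alongside the quantized weights.
    transformer.save_pretrained(save_dir)
    return transformer


def build_offloaded_pipeline(model_id, quantized_dir):
    """Items 3 and 4: reload the pre-quantized weights and enable CPU offload."""
    import torch
    from diffusers import FluxPipeline, FluxTransformer2DModel

    # No quantization_config needed here: it is read from the saved config.
    transformer = FluxTransformer2DModel.from_pretrained(
        quantized_dir, torch_dtype=torch.bfloat16
    )
    pipe = FluxPipeline.from_pretrained(
        model_id, transformer=transformer, torch_dtype=torch.bfloat16
    )
    pipe.enable_model_cpu_offload()
    return pipe
```

Calling `load_and_save_quantized(model_id, "./flux-nf4")` followed by `build_offloaded_pipeline(model_id, "./flux-nf4")` would exercise the whole path; both need a CUDA GPU, the bitsandbytes package, and access to whichever checkpoint you pass as `model_id`.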
INTRODUCING Hugging Face AutoTrain Client 🔥 Fine-tuning models got even easier!!!! Now you can fine-tune SOTA models on all compatible dataset-model pairs on Hugging Face Hub using Python on Hugging Face Servers. Choose from a number of GPU flavors, millions of models and dataset pairs and 10+ tasks 🤗
To try it, install autotrain-advanced using pip. You can skip dependency resolution by installing with --no-deps, but then you'd need to install some dependencies by hand.
Effortlessly stay up-to-date with AI research trends using a new AI tool, "AI Paper Reviewer" !!
It analyzes a list of Hugging Face Daily Papers (w/ @akhaliq) and turns them into insightful blog posts. This project leverages Gemini models (1.5 Pro, 1.5 Flash, and 1.5 Flash-8B) for content generation and Upstage Document Parse for parsing the layout and contents. Blog link: https://deep-diver.github.io/ai-paper-reviewer/
Also, here is the link to the GitHub repository for the parsing and generation pipeline. With it, you can easily build your own GitHub static pages from any arXiv papers that interest you: https://github.com/deep-diver/paper-reviewer
Smol TTS models are here! OuteTTS-0.1-350M - Zero shot voice cloning, built on LLaMa architecture, CC-BY license! 🔥
> Pure language modeling approach to TTS
> Zero-shot voice cloning
> LLaMa architecture w/ Audio tokens (WavTokenizer)
> BONUS: Works on-device w/ llama.cpp ⚡
Three-step approach to TTS:
> Audio tokenization using WavTokenizer (75 tok per second)
> CTC forced alignment for word-to-audio token mapping
> Structured prompt creation w/ transcription, duration, audio tokens
The model is extremely impressive for 350M parameters! Kudos to the OuteAI team on such a brilliant feat - I'd love to see this applied to larger data and smarter backbones like SmolLM 🤗
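To make the three-step TTS approach above a bit more concrete, here is a rough sketch of the token-rate arithmetic and of what a structured prompt could look like. Only the 75 tokens-per-second rate comes from the post; the `<text>`, `<t…>` and `<a…>` markers, the function names, and the input layout are hypothetical and are not OuteTTS's actual prompt format.

```python
TOKENS_PER_SECOND = 75  # WavTokenizer rate quoted in the post


def audio_token_count(duration_s: float) -> int:
    """Number of audio tokens covering a clip of the given length."""
    return round(duration_s * TOKENS_PER_SECOND)


def build_prompt(aligned_words):
    """Assemble a structured prompt from forced-alignment output.

    aligned_words: list of (word, duration_s, audio_tokens) tuples, i.e. what
    a CTC forced aligner would give after mapping words to audio tokens.
    The marker syntax below is made up for illustration.
    """
    transcript = " ".join(word for word, _, _ in aligned_words)
    lines = [f"<text>{transcript}</text>"]
    for word, duration_s, tokens in aligned_words:
        codes = "".join(f"<a{t}>" for t in tokens)
        lines.append(f"{word}<t{duration_s:.2f}>{codes}")
    return "\n".join(lines)


example = build_prompt([("hi", 0.04, [1, 2, 3])])
```

At 75 tokens per second, a 2-second reference clip comes to 150 audio tokens, which gives a feel for how quickly the context window fills up when cloning from longer samples.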