Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Sign Up

All HF Hub posts

wolfram 
posted an update 1 day ago
view post
Post
3319
Finally finished my extensive **Qwen 3 evaluations** across a range of formats and quantisations, focusing on **MMLU-Pro** (Computer Science).

A few take-aways stood out - especially for those interested in local deployment and performance trade-offs:

1️⃣ **Qwen3-235B-A22B** (via Fireworks API) tops the table at **83.66%** with ~55 tok/s.
2️⃣ But the **30B-A3B Unsloth** quant delivered **82.20%** while running locally at ~45 tok/s and with zero API spend.
3️⃣ The same Unsloth build is ~5x faster than Qwen's **Qwen3-32B**, which scores **82.20%** as well yet crawls at <10 tok/s.
4️⃣ On Apple silicon, the **30B MLX** port hits **79.51%** while sustaining ~64 tok/s - arguably today's best speed/quality trade-off for Mac setups.
5️⃣ The **0.6B** micro-model races above 180 tok/s but tops out at **37.56%** - that's why it's not even on the graph (50 % performance cut-off).

All local runs were done with LM Studio on an M4 MacBook Pro, using Qwen's official recommended settings.

**Conclusion:** Quantised 30B models now get you ~98 % of frontier-class accuracy - at a fraction of the latency, cost, and energy. For most local RAG or agent workloads, they're not just good enough - they're the new default.

Well done, Qwen - you really whipped the llama's ass! And to OpenAI: for your upcoming open model, please make it MoE, with toggleable reasoning, and release it in many sizes. *This* is the future!
·
DawnC 
posted an update about 19 hours ago
view post
Post
2177
VisionScout — Now with Video Analysis! 🚀

I’m excited to announce a major update to VisionScout, my interactive vision tool that now supports VIDEO PROCESSING, in addition to powerful object detection and scene understanding!

⭐️ NEW: Video Analysis Is Here!
🎬 Upload any video file to detect and track objects using YOLOv8.
⏱️ Customize processing intervals to balance speed and thoroughness.
📊 Get comprehensive statistics and summaries showing object appearances across the entire video.

What else can VisionScout do?

🖼️ Analyze any image and detect 80 object types with YOLOv8.
🔄 Switch between Nano, Medium, and XLarge models for speed or accuracy.
🎯 Filter by object classes (people, vehicles, animals, etc.) to focus on what matters.
📊 View detailed stats on detections, confidence levels, and distributions.
🧠 Understand scenes — interpreting environments and potential activities.
⚠️ Automatically identify possible safety concerns based on detected objects.

What’s coming next?
🔎 Expanding YOLO’s object categories.
⚡ Faster real-time performance.
📱 Improved mobile responsiveness.

My goal:
To bridge the gap between raw detection and meaningful interpretation.
I’m constantly exploring ways to help machines not just "see" but truly understand context — and to make these advanced tools accessible to everyone, regardless of technical background.

Try it now! 🖼️👉 DawnC/VisionScout

If you enjoy VisionScout, a ❤️ Like for this project or feedback would mean a lot and keeps me motivated to keep building and improving!

#ComputerVision #ObjectDetection #VideoAnalysis #YOLO #SceneUnderstanding #MachineLearning #TechForLife
  • 2 replies
·
RiverZ 
posted an update 3 days ago
view post
Post
5287
🔥 We're thrilled to share some exciting news about ICEdit! Currently, ICEdit app ( RiverZ/ICEdit) has soared to the second place on the weekly trend list of Hugging Face Space, just trailing behind Qwen3. What's more, it also holds the second position on the overall space trend list. This achievement wouldn't have been possible without your incredible support and love. A huge thank you to each and every one of you❤!

🎉 The ICEdit community has been incredibly active, and we've seen a plethora of amazing ComfyUI workflows being shared. For instance, with the help of ComfyUI - nunchaku, you can run ICEdit locally with just 4GB of VRAM. This makes it much more accessible for those with limited hardware resources.

🎇 If you're interested in the detailed information, please head over to our repository. We highly encourage you to give these workflows a try and explore the creative possibilities that ICEdit offers.

Github Repo: https://github.com/River-Zhang/ICEdit
Hugging Face Space: RiverZ/ICEdit
giadap 
posted an update 2 days ago
view post
Post
2909
Ever notice how some AI assistants feel like tools while others feel like companions? Turns out, it's not always about fancy tech upgrades, because sometimes it's just clever design.

Our latest blog post at Hugging Face dives into how minimal design choices can completely transform how users experience AI. We've seen our community turn the same base models into everything from swimming coaches to interview prep specialists with surprisingly small tweaks.

The most fascinating part? When we tested identical models with different "personalities" in our Inference Playground, the results were mind-blowing.

Want to experiment yourself? Our Inference Playground lets anyone (yes, even non-coders!) test these differences in real-time. You can:

- Compare multiple models side-by-side
- Customize system prompts
- Adjust parameters like temperature
- Test multi-turn conversations

It's fascinating how a few lines of instruction text can transform the same AI from strictly professional to seemingly caring and personal, without changing a single line of code in the model itself.

Read more here: https://huggingface.co/blog/giadap/ai-personas
merve 
posted an update 3 days ago
view post
Post
4438
A ton of impactful models and datasets in open AI past week, let's summarize the best 🤩 merve/releases-apr-21-and-may-2-6819dcc84da4190620f448a3

💬 Qwen made it rain! They released Qwen3: new dense and MoE models ranging from 0.6B to 235B 🤯 as well as Qwen2.5-Omni, any-to-any model in 3B and 7B!
> Microsoft AI released Phi4 reasoning models (that also come in mini and plus sizes)
> NVIDIA released new CoT reasoning datasets
🖼️ > ByteDance released UI-TARS-1.5, native multimodal UI parsing agentic model
> Meta released EdgeTAM, an on-device object tracking model (SAM2 variant)
🗣️ NVIDIA released parakeet-tdt-0.6b-v2, a smol 600M automatic speech recognition model
> Nari released Dia, a 1.6B text-to-speech model
> Moonshot AI released Kimi Audio, a new audio understanding, generation, conversation model
👩🏻‍💻 JetBrains released Melium models in base and SFT for coding
> Tesslate released UIGEN-T2-7B, a new text-to-frontend-code model 🤩
VirtualOasis 
posted an update 3 days ago
view post
Post
2804
Agents vs. Workflows
Agents are systems where LLMs dynamically direct their processes and tool usage, maintaining control over how they accomplish tasks.
Workflows are through predefined code paths, ensuring that each step is executed in a deterministic manner.

Agents are like smart assistants that can think on their own. They understand situations, make decisions, and act, whatever the task is new or unpredictable. Think of the Agent as a chef who can make a meal based on what they have.

Workflows are like a recipe with fixed steps. They’re a series of tasks done in order, like following a checklist for approving a loan. They’re great for tasks that don’t change much.
sequelbox 
posted an update 2 days ago
view post
Post
2332
NEW RELEASE: Esper 3 for Qwen 3!

- A full-stack software assistant: a reasoning finetune focused on coding, architecture, and DevOps using the Titanium and Tachibana datasets!
- Improved general and creative reasoning skills, powered by the Raiden dataset.

4B model: ValiantLabs/Qwen3-4B-Esper3
8B model: ValiantLabs/Qwen3-8B-Esper3

We'll also be bringing Esper 3 to larger Qwen 3 models as soon as we can - if you want these, consider helping us out: sequelbox/SupportOpenSource

More models and datasets to come soon!

with my love and enthusiasm,
allegra
  • 1 reply
·
merve 
posted an update 4 days ago
view post
Post
6223
A real-time object detector much faster and accurate than YOLO with Apache 2.0 license just landed to Hugging Face transformers 🔥

D-FINE is the sota real-time object detector that runs on T4 (free Colab) 🤩

> Collection with all checkpoints and demo ustc-community/d-fine-68109b427cbe6ee36b4e7352

Notebooks:
> Tracking https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_tracking.ipynb
> Inference https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_inference.ipynb
> Fine-tuning https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_finetune_on_a_custom_dataset.ipynb
h/t @vladislavbro @qubvel-hf @ariG23498 and the authors of the paper 🎩

Regular object detectors attempt to predict bounding boxes in (x, y, w, h) pixel perfect coordinates, which is very rigid and hard to solve 🥲☹️



D-FINE formulates object detection as a distribution for bounding box coordinates, refines them iteratively, and it's more accurate 🤩

Another core idea behind this model is Global Optimal Localization Self-Distillation ⤵️

this model uses final layer's distribution output (sort of like a teacher) to distill to earlier layers to make early layers more performant.

  • 2 replies
·
clem 
posted an update about 14 hours ago
prithivMLmods 
posted an update 2 days ago
view post
Post
2878
Well, here’s the updated version with the 20,000+ entry sampled dataset for Watermark Filter Content Moderation models incl. [Food25, Weather, Watermark, Marathi/Hindi Sign Language Detection], post-trained from the base models: sigLip2 patch16 224 — now with mixed aspect ratios for better performance and reduced misclassification. 🔥

Models :
➮ Watermark-Detection : prithivMLmods/Watermark-Detection-SigLIP2
⌨︎ Watermark Detection & Batch Image Processing Experimentals, Colab Notebook : https://colab.research.google.com/drive/1mlQrSsSjkGimUt0VyRi3SoWMv8OMyvw3?usp=drive_link
➮ Weather-Image-Classification : prithivMLmods/Weather-Image-Classification
➮ TurkishFoods-25 : prithivMLmods/TurkishFoods-25
➮ Marathi-Sign-Language-Detection : prithivMLmods/Marathi-Sign-Language-Detection
➮ Hindi-Sign-Language-Detection : prithivMLmods/Hindi-Sign-Language-Detection

Datasets :
Watermark : qwertyforce/scenery_watermarks
Weather : prithivMLmods/WeatherNet-05-18039
Turkish Foods 25 : yunusserhat/TurkishFoods-25
Marathi Sign Language : VinayHajare/Marathi-Sign-Language
Hindi Sign Language : Vedant3907/Hindi-Sign-Language-Dataset

Collection : prithivMLmods/content-filters-siglip2-vit-68197e3357d4de18fb3b4d2b