All HF Hub posts

mike-ravkine
posted an update 3 days ago
Let's talk about one of the hidden gems in the ReasonScape evaluation results, lucky #13: aquif-ai/aquif-3.5-8B-Think

Built on top of the solid Qwen3-8B foundation, aquif-3.5-8B-Think successfully preserves the high performance of the original model while consuming 30-50% fewer reasoning tokens.

The most notable regression vs the base model here is in arithmetic: if your workload is math-heavy, this model demonstrates an unfortunate collapse in performance as complexity grows.

The interesting combination of awesome overall performance on simple SVG shape identification coupled with a total inability to recognize more complex shapes like 'House' or 'Arrow' is a behavior directly inherited from the base model (but with a ~20% improvement in token utilization).

If you like your reasoning models token-efficient, Aquif-3.5-8B-Think is well worth a spin.

Higher resolution, more detailed, interactive plots are available at the m12X explorer: https://reasonscape.com/m12x/explorer/
abdurrahmanbutler
posted an update about 15 hours ago
🎉 I am excited to share news of a project my brother, Umar Butler, and I have been working on for what feels like an eternity now.

๐ˆ๐ง๐ญ๐ซ๐จ๐๐ฎ๐œ๐ข๐ง๐  ๐Œ๐‹๐„๐ โ€” ๐ญ๐ก๐ž ๐Œ๐š๐ฌ๐ฌ๐ข๐ฏ๐ž ๐‹๐ž๐ ๐š๐ฅ ๐„๐ฆ๐›๐ž๐๐๐ข๐ง๐  ๐๐ž๐ง๐œ๐ก๐ฆ๐š๐ซ๐ค.

A suite of 10 high-quality English legal IR datasets, designed by legal experts to set a new standard for comparing embedding models.

Whether you're exploring legal RAG on your home computer or running enterprise-scale retrieval, apples-to-apples evaluation is crucial. That's why we've open-sourced everything, including our 7 brand-new, hand-crafted retrieval datasets. All of these datasets are now live on Hugging Face.

Any guesses which embedding model leads on legal retrieval?

๐‡๐ข๐ง๐ญ: itโ€™s not OpenAI or Google - they place 7th and 9th on our leaderboard.

To do well on MLEB, embedding models must demonstrate both extensive legal domain knowledge and strong legal reasoning skills.

https://huggingface.co/blog/isaacus/introducing-mleb
adlumal
posted an update about 15 hours ago
MLEB is the largest, most diverse, and most comprehensive benchmark for legal text embedding models. https://huggingface.co/blog/isaacus/introducing-mleb
wenhuach
posted an update 1 day ago
mike-ravkine
posted an update about 23 hours ago
There are two very interesting reasoning models from ServiceNow-AI that I think are flying under everyone's radar - let's take a closer look at ServiceNow-AI/Apriel-1.5-15b-Thinker (#10 on the ReasonScape rankings) and ServiceNow-AI/Apriel-Nemotron-15b-Thinker (landing just below its brother at #12).

A rather interesting attribute of these models is that I have absolutely no idea what they are fine-tuned from, other than some kind of pre-small Mistrals! The non-nemo 15b looks like Mistral Pixtral 12B but with 8 more layers, while the nemo 15b analogously looks like Mistral NeMo 12B but with 10 more layers and a smaller max context length.

The performance trade-offs between these two models are quite clear: the Nemotron provides ~30% shorter answers but at the expense of totally collapsing under difficulty on 4 of the 12 tasks ... which all just happen to have "Math" in common, so it's pretty easy to point the finger at exactly what the price for the lower reasoning token usage is here.

In principle, ServiceNow-AI/Apriel-1.5-15b-Thinker is multimodal and should be able to reason about image queries, but this is not something I have tried, as ReasonScape is not currently able to evaluate VLMs - perhaps a future improvement.
MonsterMMORPG
posted an update 1 day ago
The Secret to FREE, Local AI Image Generation is Finally Here

Tutorial video: https://youtu.be/c3gEoAyL2IE

🎨 Stop struggling with complex AI image generation! In this tutorial, I reveal the ultimate, one-click solution to creating stunning, photorealistic, and stylized AI art LOCALLY on your own computer, for FREE.

Tired of confusing workflows, endless command lines, and expensive subscriptions? Forget everything you know about Stable Diffusion and ComfyUI's complexity. I'm introducing you to SwarmUI, the revolutionary tool that leverages the power of ComfyUI's backend with an incredibly simple, user-friendly interface.

This isn't just another AI tutorial. This is a complete, all-in-one guide that takes you from ZERO to AI Art PRO in minutes. I provide a one-click installer that sets up everything you need, including pre-configured presets for achieving breathtaking realism and incredible stylization.

🔥 In This Video, You Will Discover:

The Easiest AI Art Install Ever: A step-by-step guide using my custom one-click installer for ComfyUI and SwarmUI. No coding or tech skills required!

Unlock Top-Tier Realism: Learn how to use my secret presets to generate images so realistic, you won't believe they were made on your local PC.

Master Image Editing & Upscaling: Go beyond simple generation. I'll show you how to edit images with simple text commands and upscale them to glorious 4K/8K quality.

The Secret to Speed: See how this setup optimizes performance for your specific GPU, whether you have a high-end or mid-range card.

Say Goodbye to Subscriptions: Learn how to harness the full power of state-of-the-art AI models without ever paying a monthly fee again.

This is the new standard for local AI image generation. Whether you're a complete beginner or an experienced artist, this video will change your entire workflow. It's time to unleash the full creative power of your computer.
prithivMLmods
posted an update 1 day ago
Now you can try all the latest state-of-the-art multimodal vision-language models from the Qwen3-VL series in demos on Hugging Face Spaces, including the 4B, 8B, and 30B (Instruct, 4B-Thinking) variants. I've also uploaded the weights for the Abliterated variants of these models, up to 30B parameters. Check out the Spaces and model links below! 🤗🔥

✨ Qwen3-VL[4B,8B]: prithivMLmods/Qwen3-VL-Outpost
✨ Qwen3-VL-30B-A3B-Demo: prithivMLmods/Qwen3-VL-HF-Demo
✨ Collection: prithivMLmods/multimodal-implementations-67c9982ea04b39f0608badb0

Qwen3-VL Abliterated Model Collection [ Version 1.0 ]

✨ Qwen3-VL-8B-Instruct-abliterated: prithivMLmods/Qwen3-VL-8B-Instruct-abliterated
✨ Qwen3-VL-4B-Instruct-abliterated: prithivMLmods/Qwen3-VL-4B-Instruct-abliterated
✨ Qwen3-VL-8B-Thinking-abliterated: prithivMLmods/Qwen3-VL-8B-Thinking-abliterated
✨ Qwen3-VL-4B-Thinking-abliterated: prithivMLmods/Qwen3-VL-4B-Thinking-abliterated
✨ Qwen3-VL-30B-A3B-Instruct-abliterated: prithivMLmods/Qwen3-VL-30B-A3B-Instruct-abliterated

⚡ Collection: prithivMLmods/qwen3-vl-abliteration-oct-1625-68f0e3e567ef076594605fac

Note: This is version 1.0 of the Abliteration of the Qwen3-VL series of models. It may perform sub-optimally in some cases. If you encounter any issues, please open a discussion.
nick007x
posted an update 1 day ago
👋 Hey, I have just uploaded 2 new datasets for code and scientific reasoning models:

1. ArXiv Papers (4.6 TB): A massive scientific corpus with papers and metadata across all domains. Perfect for training models on academic reasoning, literature review, and scientific knowledge mining. 🔗 Link: nick007x/arxiv-papers

2. GitHub Code 2025 (1 TB): A comprehensive code dataset for code generation and analysis tasks. It mostly contains GitHub's top 1 million high-quality repos with more than 2 stars. 🔗 Link: nick007x/github-code-2025
TravisMuhlestein
posted an update 2 days ago
Building AI Agents from First Principles at GoDaddy

Everyone's talking about AI agents lately, and for good reason. But at GoDaddy, we're going deeper: starting from first principles to explore what makes an agent truly robust and usable in real-world scenarios.

Instead of asking "What can we build fast?" we're asking "What design choices make agents flexible, testable, and reliable long term?"

Core Concepts

• Tool-centric design: everything an agent does is a tool call, with precise APIs and granularity.
• Decision vs. delivery: agents decide what to do; tools handle how to do it, keeping systems modular.
• Structured outputs & reflection: LLMs output both the tool call and the reason behind it, making debugging and iteration easier.
• Universal tools: even user interactions (inform, confirm, request) are abstracted as tools, clarifying boundaries between logic and interface.
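
The concepts above could be sketched roughly as follows. This is a minimal illustration, not GoDaddy's implementation: `ToolCall`, `ToolRegistry`, `decide`, and the tool names `route_ticket` and `inform_user` are all hypothetical, and the decision step is a keyword rule standing in for an LLM that returns structured output.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class ToolCall:
    """Structured output of the decision step: which tool, which args, and why."""
    tool: str
    args: Dict[str, Any]
    reason: str


class ToolRegistry:
    """Maps tool names to plain callables (the 'delivery' side)."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = fn

    def execute(self, call: ToolCall) -> Any:
        if call.tool not in self._tools:
            raise KeyError(f"unknown tool: {call.tool}")
        return self._tools[call.tool](**call.args)


def decide(message: str) -> ToolCall:
    """Hypothetical stand-in for the LLM decision step (keyword rule, no model)."""
    if "refund" in message.lower():
        return ToolCall(
            tool="route_ticket",
            args={"queue": "billing"},
            reason="Message mentions a refund, so billing should handle it.",
        )
    return ToolCall(
        tool="inform_user",
        args={"text": "Could you share more detail?"},
        reason="No known intent matched, so ask the user for more information.",
    )


registry = ToolRegistry()
# A back-end action and a user interaction are registered the same way.
registry.register("route_ticket", lambda queue: f"routed to {queue}")
registry.register("inform_user", lambda text: f"said: {text}")

call = decide("I want a refund for my domain")
result = registry.execute(call)
print(call.reason)
print(result)
```

Because `decide` only returns a `ToolCall` and never executes anything, the decision logic can be swapped or tested without touching the tools, and the recorded `reason` is available for debugging each step.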

Real-world use cases → Not just theory

✅ Routing and responding to support messages
✅ Surfacing emerging trends in sales data
✅ Automating scheduling, inventory, or operations orchestration

What we learned

• Treating everything as a tool makes systems more predictable and extensible
• LLM "verbosity" is valuable: it reveals reasoning and speeds iteration
• Separating decision from execution reduces fragility and simplifies updates

We're still at the beginning, but these principles give us a strong foundation. As agents evolve, architectural clarity matters more than chasing the latest framework.

👉 Curious about architecture patterns that scale? Dive in here: Building AI Agents at GoDaddy: An Experiment in First Principles https://www.godaddy.com/resources/news/building-ai-agents-at-godaddy-an-experiment-in-first-principles
ZennyKenny
posted an update 2 days ago
Did Hugging Face just ban hammer a bunch of bot accounts or am I just so uninteresting that 30% of my subs dropped me overnight?

😬 Wait, don't answer that.