Mistral Small 3 is SUPER fast and posts the highest score among 20B+ models, but it's still 11 points below Qwen 2.5 Coder 32B.
I believe specialty models are the future. The more you know what you want from a model, the better bang you get for your buck. If Mistral scopes this small model to coding only, I'm confident they can beat Qwen.
One day my leaderboard will be dominated by smol models excellent at one thing, not monolithic ones costing $$$. And I'm looking forward to that.
Yes, DeepSeek R1's release is impressive. But the real story is what happened in just the 7 days after:
- Original release: 8 models, 540K downloads. Just the beginning...
- The community turned those open-weight models into 550+ NEW models on Hugging Face. Total downloads? 2.5M, nearly 5X the originals.
The reason? DeepSeek's models are open-weight, so anyone can build on top of them. Interestingly, the community focused on quantized versions for better efficiency and accessibility: models that use less memory, run faster, and are more energy-efficient.
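A back-of-envelope sketch of why those quantized versions matter for accessibility. The numbers below are illustrative assumptions (weights-only, ignoring KV cache and overhead), not measured footprints of any specific release:

```python
# Rough weights-only memory estimate for a 32B-parameter model at
# different quantization levels. Illustrative math, not a benchmark.

def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate GB needed just to hold the weights."""
    return n_params * bits_per_weight / 8 / 1e9

n = 32e9  # e.g. a DeepSeek-R1-Distill-Qwen-32B-class model
for label, bits in [("FP16", 16), ("Q8", 8), ("Q4", 4)]:
    print(f"{label}: ~{weight_memory_gb(n, bits):.0f} GB")
# FP16 needs ~64 GB just for weights; a 4-bit quant fits in ~16 GB,
# which is why quantized GGUF builds run on consumer hardware.
```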
When you empower builders, innovation explodes. For everyone.
The most popular community model? @bartowski's DeepSeek-R1-Distill-Qwen-32B-GGUF, with 1M downloads alone.