All HF Hub posts

lysandre 
posted an update 2 days ago
We're kick-starting the process of Transformers v5, with @ArthurZ and @cyrilvallez !

v5 should be significant: we're using it as a milestone for performance optimizations, saner defaults, and a much cleaner code base worthy of 2025.

Fun fact: v4.0.0-rc-1 came out on Nov 19, 2020, nearly five years ago!
mitkox 
posted an update 2 days ago
I’ve built my blocker for AI-generated content. It’s a local AI running on my laptop with a browser extension that classifies and scrubs synthetic content from my eyeballs. I’m too old for this synthetic noise.

TL;DR I’m going full John Connor on the AI content apocalypse

Think of it as an on-device AI ad-blocker, but for:
Em-dash overdose. Seriously, why is everything suddenly revolutionary—disruptive—life-changing?
AI influencers’ auto-generated posts and images, auto-posted, all hands-free.
Fake news, fake images, fake people... puff.

Surprisingly, it works. I suppose it will block some human-generated content. However, I would rather read a 2007 Myspace blog than another “10 Growth Hacks Powered By ChatGPT” post.
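The post does not say how the classifier works. As a purely illustrative sketch of the "em-dash overdose" signal mentioned above, the snippet below scores text by em-dash density. The function names and the threshold of 2 em-dashes per 100 words are assumptions made up for this example, not the author's method.

```python
import re

# The em-dash character, written as an escape so it is easy to spot.
EM_DASH = "\u2014"


def em_dash_density(text: str) -> float:
    """Return the number of em-dashes per 100 words (0.0 for empty text)."""
    words = re.findall(r"\w+", text)
    if not words:
        return 0.0
    return 100.0 * text.count(EM_DASH) / len(words)


def looks_synthetic(text: str, threshold: float = 2.0) -> bool:
    """Flag text whose em-dash density exceeds a hypothetical threshold."""
    return em_dash_density(text) > threshold


# Example inputs adapted from the post itself.
sample = "Everything is suddenly revolutionary\u2014disruptive\u2014life-changing"
plain = "I would rather read a 2007 Myspace blog"
print(looks_synthetic(sample), looks_synthetic(plain))
```

A single punctuation heuristic like this would misfire on plenty of human writing, which matches the post's own caveat that some human-generated content gets blocked.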
prithivMLmods 
posted an update 1 day ago
I'm a Hugging Face Fellow now, guys!🤗❤️

With the same passion, trust, and momentum to contribute to the community, I’m excited to do some amazing things to wrap up Q3 and Q4 of 2025. And importantly, I’ve been lucky enough to receive some knowledge and guidance from @merve to build open-source demos and stuff. Thank you for the belief.

Thank you — much love.
Long live open source!

— Prithiv
ZennyKenny 
posted an update 1 day ago
The open source Synthetic Data SDK from MOSTLY AI: mostlyai offers the ability to generate realistic, privacy-safe synthetic data with just a few lines of Python.

Try it out yourself in a No Code UI in the SDK Demo Space: mostlyai/synthetic-sdk-demo
aposadasn 
posted an update 2 days ago
My team at arclabmit created robotic teleoperation and learning software for controlling robots, recording datasets, and training physical AI models. It is compatible with lerobot. This work was part of a paper we published at ICCR Kyoto 2025. Check out our code here: https://github.com/ARCLab-MIT/beavr-bot/tree/main

Our work aims to solve two key problems in the world of robotic manipulation:

1. The lack of a well-developed, open-source, accessible teleoperation system that can work out of the box.

2. The lack of a performant end-to-end control, recording, and learning platform for robots that is completely hardware agnostic.

If you are curious to learn more or have any questions, please feel free to reach out!

Paper: BEAVR: Bimanual, multi-Embodiment, Accessible, Virtual Reality Teleoperation System for Robots (2508.09606)
salma-remyx 
posted an update about 22 hours ago
Search is such a fundamental part of content discovery, yet ends up overlooked or poorly implemented in so many apps we use every day.

We built hundreds of Docker images for arXiv papers that ship with a codebase. With DockerHub's search, it's tough to find what you're looking for unless you happen to have the arXiv ID handy.

So we added full text search over these resources so that you're that much closer to testing a new promising idea. More resources to be indexed soon!

Full Demo: https://www.youtube.com/watch?v=GjYReWbQZw8
Try it here!: https://engine.remyx.ai/resources
Join us at Experiment 2025!: https://experiment.remyx.ai
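The post does not describe how the search is implemented. As a minimal sketch of the general idea of full-text search, the snippet below builds an inverted index over short descriptions and answers queries by intersecting token matches. The document IDs, descriptions, and function names are hypothetical placeholders, not Remyx data or APIs.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return IDs of documents that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results


# Hypothetical resources keyed by made-up IDs.
docs = {
    "example-1": "Virtual reality teleoperation system for robots",
    "example-2": "Image segmentation benchmark for robots",
}
index = build_index(docs)
print(search(index, "robots teleoperation"))
```

Searching over descriptions rather than identifiers is what removes the need to know an arXiv ID in advance; a production system would add ranking, stemming, and persistence on top of this.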
meg 
posted an update about 24 hours ago
🤖 As AI-generated content is shared in movies/TV/across the web, there's one simple low-hanging fruit 🍇 to help people know what's real: visible watermarks. With the Gradio team, I've made sure it's trivially easy to add this disclosure to images, video, and chatbot text. See how: https://huggingface.co/blog/watermarking-with-gradio
Thanks in particular to @abidlabs and Yuvraj Sharma for the code collab.
kanaria007 
posted an update 2 days ago
✅ New Article: *Earth under the Cosmic Intelligence Model — Methodological Spec*

Title:
📝 CIM–Earth: A Methodology to Validate Earth Against the Cosmic Intelligence Model
🔗 https://huggingface.co/blog/kanaria007/cim-earth-spec

---

Summary:
This article is not a prediction, but a *methodological specification*.
It outlines how the *Cosmic Intelligence Model (CIM)* can be mapped onto Earth, using only structural metrics and recomputation procedures.

All numbers are placeholders — the emphasis is on reproducibility, auditability, and clarity of method.

> Not a forecast, but a framework.
> Not results, but the path to results.

---

Why It Matters:
• Demonstrates how CIM can be applied consistently to real civilizations
• Provides receiver-side recomputation rules for future empirical releases
• Keeps theory transparent, auditable, and open to refinement

---

What’s Inside:
• Recap of CIM metrics (R_A, SEV, EAI, etc.)
• Structural mapping procedure for Earth (Spec-only)
• Guidelines for provenance, rollback, and recomputation
• Why methodology matters as much as results

---

📖 Cosmic Intelligence Series — Article 3

Where the previous article solved self-reference,
this one provides a methodological foundation — ensuring future applications of CIM can be tested, verified, and audited.

---

Next: *Internal Warning — Against Theoretical Invincibility*
The following article is a cautionary reflection: even when methods are rigorous, a theory must not mistake structure for invulnerability.

> From specification to humility,
> structure reminds us that even models must remain open.
merve 
posted an update about 2 hours ago
IBM just released a small Swiss Army knife for document models: granite-docling-258M on Hugging Face 🔥

> not only a document converter: it can also do document question answering and understand multiple languages 🤯
> best part: released with Apache 2.0 license 👏 use it with your commercial projects!
> it supports transformers, vLLM and MLX from the get-go! 🤗
> built on SigLIP2 & granite-165M

model: ibm-granite/granite-docling-258M
demo: ibm-granite/granite-docling-258m-demo 💗
merve 
posted an update 2 days ago