Thomas Wolf PRO

thomwolf

AI & ML interests

NLP and open-source :-)

Organizations

Hugging Face, Natural Language Processing with Transformers, BigScience Workshop, Flax Community, datablations, Training Transformers Together, BigScience Data, Evaluation datasets, HuggingFaceBR4, Godot Engine Demos, OpenAssistant, Evaluation on the Hub, HuggingFaceM4, Simulation Environments Tests and Builds, (De)fusing, HuggingFaceGECLM, CodeParrot, BigCode, Hugging Face H4, CV as NLP, Explorer of Simulate alpha, BigCode Data, Hugging Face Extreme-Scale, Hugging Face H4 Community, GAIA, Hugging Face TB Research, Hugging Face Smol Cluster, Open LLM Leaderboard, TTS Eval (OLD), the circle of truth - war scene, Nanotron Research, LeRobot, Journalists on Hugging Face, MLX Community, Hugging Face Assignments, HuggingFaceFW, TTS AGI, Social Post Explorers, dora-rs, HuggingFaceEval, HuggingFaceFW-Dev, Hugging Face Discord Community, DataComp, Data Agents, Hugging Face FineVideo, HuggingFace Science Team, Art, smol-explorers, Hugging Face Science, LeMaterial, open/ acc

thomwolf's activity

Reacted to merve's post with 🔥👍 about 10 hours ago
The authors of ColPali trained a retrieval model based on SmolVLM 🤠 vidore/colsmolvlm-alpha
TLDR;

- ColSmolVLM performs better than ColPali and DSE-Qwen2 on all English tasks

- ColSmolVLM is more memory efficient than ColQwen2 💗
New activity in argilla/synthetic-data-generator about 11 hours ago

Some notes/feedback
#15 opened about 11 hours ago by thomwolf
Reacted to davanstrien's post with ❤️ 2 days ago
First dataset for the new Hugging Face Bluesky community organisation: bluesky-community/one-million-bluesky-posts 🦋

📊 1M public posts from Bluesky's firehose API
🔍 Includes text, metadata, and language predictions
🔬 Perfect to experiment with using ML for Bluesky 🤗

Excited to see people build more open tools for a more open social media platform!
Reacted to ZennyKenny's post with 👍 4 days ago
I've joined the Bluesky community. Interested to see what decentralized social media looks like in action: https://bsky.app/profile/kghamilton.bsky.social

Looking forward to following other AI builders, tech enthusiasts, goth doomscrollers, and ironic meme creators.
Reacted to as-cle-bert's post with 🔥 4 days ago
Hi HuggingFacers! 🤗
I'm thrilled to introduce my latest project: 𝗦𝗲𝗻𝗧𝗿𝗘𝘃 (𝗦𝗲𝗻tence 𝗧𝗿ansformers 𝗘𝘃aluator), a Python package that offers simple, customizable evaluation of text retrieval accuracy and time performance for Sentence Transformers-compatible text embedders on PDF data! 📊

Learn more in my LinkedIn post: https://www.linkedin.com/posts/astra-clelia-bertelli-583904297_python-embedders-semanticsearch-activity-7266754133557190656-j1e3

And on the GitHub repo: https://github.com/AstraBert/SenTrEv

Have fun! 🍕
posted an update 5 days ago