Brigitte Tousignant

BrigitteTousi

AI & ML interests

None yet

Recent Activity


Organizations

Hugging Face, Society & Ethics, HuggingFaceM4, Open-Source AI Meetup, BigCode, Hugging Face OSS Metrics, IBM-NASA Prithvi Models Family, Hugging Face Smol Models Research, Wikimedia Movement, LeRobot, Journalists on Hugging Face, Women on Hugging Face, Social Post Explorers, Dev Mode Explorers, Hugging Face Science, Coordination Nationale pour l'IA, open/ acc, Bluesky Community, Sandbox, Open R1

BrigitteTousi's activity

reacted to giadap's post with πŸ€—πŸ”₯ about 12 hours ago
πŸ€— Just published: "Consent by Design" - exploring how we're building better consent mechanisms across the HF ecosystem!

Our research shows open AI development enables:
- Community-driven ethical standards
- Transparent accountability
- Context-specific implementations
- Privacy as core infrastructure

Check out our Space Privacy Analyzer tool that automatically generates privacy summaries of applications!

Effective consent isn't about perfect policies; it's about architectures that empower users while enabling innovation. πŸš€

Read more: https://huggingface.co/blog/giadap/consent-by-design
reacted to burtenshaw's post with πŸš€ about 12 hours ago
Hacked my presentation building with inference providers, Cohere Command A, and sheer simplicity. Use this script if you're burning too much time on presentations:

πŸ”— https://github.com/burtenshaw/course_generator/blob/main/scripts/create_presentation.py

This is what it does:
- uses Command A to generate slides and speaker notes from your source material
- renders the slides in remark's open markdown format, importing all images, tables, etc.
- lets you review the slides as markdown and iterate
- exports to either PDF or PPTX using backslide
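The remark rendering step above can be sketched as a small helper that joins slide data into remark's markdown conventions (`---` between slides, `???` before speaker notes). The function name and slide contents here are illustrative, not taken from the actual script:

```python
def to_remark_markdown(slides):
    """Render a list of slide dicts into remark format: slides are
    separated by '---' and speaker notes follow a '???' marker."""
    rendered = []
    for slide in slides:
        body = f"# {slide['title']}\n\n{slide['content']}"
        if slide.get("notes"):
            body += f"\n\n???\n{slide['notes']}"
        rendered.append(body)
    return "\n\n---\n\n".join(rendered)

deck = to_remark_markdown([
    {"title": "Intro", "content": "Why automate slides?", "notes": "Greet the audience."},
    {"title": "Pipeline", "content": "LLM -> remark markdown -> backslide"},
])
```

The resulting markdown can be reviewed, edited, and then handed to backslide for PDF/PPTX export.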

πŸš€ Next steps are: add text to speech for the audio and generate a video. This should make Hugging Face educational content scale to a billion AI Learners.
reacted to yjernite's post with πŸ”₯ about 17 hours ago
Today in Privacy & AI Tooling - introducing a nifty new tool to examine where data goes in open-source apps on πŸ€—

HF Spaces have tons (100Ks!) of cool demos leveraging or examining AI systems - and because most of them are OSS we can see exactly how they handle user data πŸ“šπŸ”

That requires actually reading the code though, which isn't always easy or quick! Good news: code LMs have gotten pretty good at automatic review, so we can offload some of the work - here I'm using Qwen/Qwen2.5-Coder-32B-Instruct to generate reports and it works pretty OK πŸ™Œ

The app works in four stages:
1. Download all code files
2. Use the Code LM to generate a detailed report pointing to code where data is transferred/(AI-)processed (screen 1)
3. Summarize the app's main functionality and data journeys (screen 2)
4. Build a Privacy TLDR with those inputs
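The stages above could be wired together roughly like this. A minimal sketch: `review_code` stands in for the actual Qwen/Qwen2.5-Coder-32B-Instruct call (here it just pattern-matches outbound-request lines), and the report format is invented:

```python
def review_code(filename, code):
    # Placeholder for the code-LM review step; flags lines that
    # appear to transfer data off the Space.
    flagged = [i + 1 for i, line in enumerate(code.splitlines())
               if "requests." in line or "urlopen" in line]
    return {"file": filename, "data_transfer_lines": flagged}

def privacy_tldr(space_files):
    # Stage 1 (downloading the Space's code files) is assumed done:
    # space_files maps filename -> source code.
    # Stages 2-4: per-file reports, then a summarized TLDR.
    reports = [review_code(name, code) for name, code in space_files.items()]
    transfers = sum(len(r["data_transfer_lines"]) for r in reports)
    return {"reports": reports,
            "tldr": f"{transfers} potential data-transfer site(s) "
                    f"across {len(reports)} file(s)"}

result = privacy_tldr({
    "app.py": "import requests\nrequests.post(url, data=payload)\n",
})
```

In the real app, the per-file reports and the final TLDR are both written by the code LM rather than by string matching.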

It comes with a bunch of pre-reviewed apps/Spaces, great to see how many process data locally or through (private) HF endpoints πŸ€—

Note that this is a POC, lots of exciting work to do to make it more robust, so:
- try it: yjernite/space-privacy
- reach out to collab: yjernite/space-privacy
reacted to clem's post with πŸ€— about 17 hours ago
You can now bill your inference costs from all our inference partners (Together, Fireworks, fal, SambaNova, Cerebras, Hyperbolic, ...) to your Hugging Face organization.

Useful to drive more company-wide usage of AI without the billing headaches!
reacted to gavinkhung's post with πŸ€— about 17 hours ago
reacted to nyuuzyou's post with πŸ‘ about 17 hours ago
πŸ¦… EagleSFT Dataset - nyuuzyou/EagleSFT

Collection of 536,231 question-answer pairs featuring:

- Human-posed questions and machine-generated responses for SFT
- Bilingual content in Russian and English with linked IDs
- Derived from 739k+ real user queries, primarily educational topics
- Includes unique IDs and machine-generated category labels

This dataset provides a resource for supervised fine-tuning (SFT) of large language models, cross-lingual research, and understanding model responses to diverse user prompts. Released to the public domain under CC0 1.0 license.
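The linked-ID structure makes it easy to pair the Russian and English sides of the dataset. The snippet below sketches that pairing on a toy sample; the field names (`id`, `language`, `question`, `answer`) are assumptions, so check the dataset card for the real schema:

```python
def pair_bilingual(rows):
    """Group Russian and English rows that share the same linked id."""
    by_id = {}
    for row in rows:
        by_id.setdefault(row["id"], {})[row["language"]] = row
    # Keep only ids present in both languages.
    return {i: langs for i, langs in by_id.items()
            if {"ru", "en"} <= langs.keys()}

sample = [
    {"id": "q1", "language": "en", "question": "What is photosynthesis?", "answer": "..."},
    {"id": "q1", "language": "ru", "question": "Что такое фотосинтез?", "answer": "..."},
    {"id": "q2", "language": "en", "question": "Define inertia.", "answer": "..."},
]
pairs = pair_bilingual(sample)
```

The same grouping applied to the full 536,231-row dataset would surface the cross-lingual pairs useful for the research directions mentioned above.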
reacted to JLouisBiz's post with πŸ‘€ about 17 hours ago
This is a short demonstration of integrating a large language model into a user's workflow: it quickly captures whatever you have copied to your clipboard and saves it to a database (or, in your setup, to a file). From there it can be published in one click as a page or document, or kept as a note for later use.
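A minimal sketch of that capture step, assuming SQLite as the database and with the clipboard read stubbed out by a plain string (the real integration would pull from the system clipboard):

```python
import sqlite3
from datetime import datetime, timezone

def save_snippet(conn, text):
    """Store one captured clipboard snippet with a UTC timestamp."""
    conn.execute("CREATE TABLE IF NOT EXISTS snippets (ts TEXT, content TEXT)")
    conn.execute("INSERT INTO snippets VALUES (?, ?)",
                 (datetime.now(timezone.utc).isoformat(), text))
    conn.commit()

conn = sqlite3.connect(":memory:")
save_snippet(conn, "Captured from clipboard")  # stand-in for a real clipboard read
rows = conn.execute("SELECT content FROM snippets").fetchall()
```

Swapping `:memory:` for a file path gives the persistent note store described in the demo.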

https://discord.gg/N2BRPZ2jKb

reacted to JunhaoZhuang's post with ❀️ about 17 hours ago
We are excited to announce the release of our paper, "Cobra: Efficient Line Art COlorization with BRoAder References," along with the official code! Cobra is a novel efficient long-context fine-grained ID preservation framework for line art colorization, achieving high precision, efficiency, and flexible usability for comic colorization. By effectively integrating extensive contextual references, it transforms black-and-white line art into vibrant illustrations.

We invite you to explore Cobra and share your feedback! You can access the paper and code via the following links: [PDF](https://arxiv.org/abs/2504.12240) and [Project page](https://zhuang2002.github.io/Cobra/). We eagerly anticipate your engagement and support!

Thank you for your interest!
Β·
reacted to VolodymyrPugachov's post with πŸ”₯ about 17 hours ago
Introducing BioClinicalBERT-Triage: A Medical Triage Classification Model

I'm excited to share my latest project: a fine-tuned model for medical triage classification!

What is BioClinicalBERT-Triage?

BioClinicalBERT-Triage is a specialized model that classifies patient-reported symptoms into appropriate triage categories. Built on the foundation of emilyalsentzer/Bio_ClinicalBERT, this model helps healthcare providers prioritize patient care by analyzing symptom descriptions and medical history.

Why I Built This

As healthcare systems face increasing demands, efficient triage becomes crucial. This model aims to support healthcare professionals in quickly assessing the urgency of medical situations, particularly in telehealth and high-volume settings.

Model Performance

The model was trained on 42,513 medical symptom descriptions, using an 80:20 train/test split. After 3 epochs of training, the model achieved:

Final training loss: 0.3246
Processing speed: 13.99 samples/second

The loss steadily decreased throughout training:

Early training (epoch 0.24): 0.5796
Mid-training (epoch 1.65): 0.4308
Final (epoch 2.82): 0.3246
A minimal usage sketch (assumes the repo id from the model card; the exact label names come from the model's config):

from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

model_id = "VolodymyrPugachov/BioClinicalBERT-Triage"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
triage = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(triage("Sudden chest pain radiating to the left arm"))

Limitations & Ethical Considerations

This model is designed to support, not replace, clinical decision-making. It should always be used under the supervision of qualified healthcare professionals. While it performs well on common presentations, it may be less accurate for rare conditions or unusual symptom descriptions.

Try It Out

I'd love to hear your feedback if you use this model in your projects! Check out the full model card here: VolodymyrPugachov/BioClinicalBERT-Triage
#medical #healthcare #bert #nlp #triage #classification
reacted to thomwolf's post with πŸ€—πŸ”₯ 3 days ago
If you've followed the progress of robotics in the past 18 months, you've likely noticed how robotics is increasingly becoming the next frontier that AI will unlock.

At Hugging Faceβ€”in robotics and across all AI fieldsβ€”we believe in a future where AI and robots are open-source, transparent, and affordable; community-built and safe; hackable and fun. We've had so much mutual understanding and passion working with the Pollen Robotics team over the past year that we decided to join forces!

You can already find our open-source humanoid robot platform Reachy 2 on the Pollen website and the Pollen community and people here on the hub at pollen-robotics

We're so excited to build and share more open-source robots with the world in the coming months!
reacted to thomwolf's post with πŸš€β€οΈ 4 days ago
reacted to fdaudens's post with βž•β€οΈ 10 days ago
I read the 456-page AI Index report so you don't have to (kidding). The wild part? While AI gets ridiculously more accessible, the power gap is actually widening:

1️⃣ The democratization of AI capabilities is accelerating rapidly:
- The gap between open and closed models is basically closed: the difference on benchmarks like MMLU and HumanEval shrank to just 1.7% in 2024
- The cost to run GPT-3.5-level performance dropped 280x in 2 years
- Model size is shrinking while maintaining performance - Phi-3-mini hitting 60%+ MMLU at a fraction of the parameters of early models like PaLM

2️⃣ But we're seeing concerning divides deepening:
- Geographic: US private investment ($109B) dwarfs everyone else - 12x China's $9.3B
- Research concentration: US and China dominate highly-cited papers (50 and 34 respectively in 2023), while next closest is only 7
- Gender: Major gaps in AI skill penetration rates - US shows 2.39 vs 1.71 male/female ratio

The tech is getting more accessible but the benefits aren't being distributed evenly. Worth thinking about as these tools become more central to the economy.

Give it a read - fascinating portrait of where AI is heading! https://hai-production.s3.amazonaws.com/files/hai_ai_index_report_2025.pdf
Β·
reacted to jsulz's post with πŸ”₯ 11 days ago
The Llama 4 release - meta-llama/llama-4-67f0c30d9fe03840bc9d0164 - was a big one for the xet-team with every model backed by the storage infrastructure of the future for the Hub.

It's been a wild few days, and especially 🀯 to see every tensor file with a Xet logo next to it instead of LFS.

The attached graph shows requests per second to our content-addressed store (CAS) right as the release went live.

yellow = GETs; dashed line = launch time.

You can definitely tell when the community started downloading πŸ‘€

h/t to @rajatarya for the graph and to the entire Xet crew for bringing us to this point, with a special shoutout to Rajat, @port8080, @brianronan, @seanses, and @znation, who made sure the bytes kept flying all weekend ⚡️
posted an update 11 days ago
AI agents are transforming how we interact with technology, but how sustainable are they? 🌍

Design choices β€” like model size and structure β€” can massively impact energy use and cost. βš‘πŸ’° The key takeaway: smaller, task-specific models can be far more efficient than large, general-purpose ones.

πŸ”‘ Open-source models offer greater transparency, allowing us to track energy consumption and make more informed decisions on deployment. 🌱 Open-source = more efficient, eco-friendly, and accountable AI.

Read our latest, led by @sasha with assists from myself + @yjernite πŸ€—
https://huggingface.co/blog/sasha/ai-agent-sustainability
reacted to merterbak's post with πŸ”₯ 12 days ago
Meta has unveiled its Llama 4 🦙 family of models, featuring native multimodality and a mixture-of-experts architecture. Two models are available now:
ModelsπŸ€—: meta-llama/llama-4-67f0c30d9fe03840bc9d0164
Blog Post: https://ai.meta.com/blog/llama-4-multimodal-intelligence/
HF's Blog Post: https://huggingface.co/blog/llama4-release

- 🧠 Native Multimodality - Process text and images in a unified architecture
- πŸ” Mixture-of-Experts - First Llama models using MoE for incredible efficiency
- πŸ“ Super Long Context - Up to 10M tokens
- 🌐 Multilingual Power - Trained on 200 languages with 10x more multilingual tokens than Llama 3 (including over 100 languages with over 1 billion tokens each)

πŸ”Ή Llama 4 Scout
- 17B active parameters (109B total)
- 16 experts architecture
- 10M context window
- Fits on a single H100 GPU
- Beats Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1

πŸ”Ή Llama 4 Maverick
- 17B active parameters (400B total)
- 128 experts architecture
- Fits on a single DGX H100 node (8x H100)
- 1M context window
- Outperforms GPT-4o and Gemini 2.0 Flash
- ELO score of 1417 on LMArena, currently the second-best model on the arena

πŸ”Ή Llama 4 Behemoth (Coming Soon)
- 288B active parameters (2T total)
- 16 experts architecture
- Teacher model for Scout and Maverick
- Outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM benchmarks
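As a sanity check on those figures: in a mixture-of-experts model, weight memory scales with total parameters (all experts must be resident) while per-token compute scales with active parameters. A back-of-envelope sketch, where the 4-bit quantization for the single-H100 claim is an assumption:

```python
def moe_footprint_gb(total_params_b, bits_per_param):
    """Approximate weight memory in GB: the total (not active)
    parameter count drives the footprint."""
    return total_params_b * 1e9 * bits_per_param / 8 / 1e9

# Llama 4 Scout: 109B total parameters at assumed 4-bit quantization
scout_int4 = moe_footprint_gb(109, 4)     # ~54.5 GB, under an H100's 80 GB
# Llama 4 Maverick: 400B total at 8-bit
maverick_int8 = moe_footprint_gb(400, 8)  # ~400 GB, hence the multi-GPU DGX node
```

This is why Scout's single-GPU claim is plausible despite 109B total parameters, while Maverick needs the 8x H100 node.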
reacted to AdinaY's post with πŸ‘€ 17 days ago
AutoGLM 沉思 (Rumination) 💫 FREE AI Agent released by ZhipuAI

✨ Think & Act simultaneously
✨ Based on a fully self-developed stack: GLM-4 for general, GLM-Z1 for inference, and GLM-Z1-Rumination for rumination
✨ Will openly share these models on April 14 🀯

Preview versionπŸ‘‰ https://autoglm-research.zhipuai.cn/?channel=autoglm_android