Simon Brandeis

sbrandeis

AI & ML interests

None yet

Recent Activity

Articles

Organizations

Hugging Face, AutoNLP, 21 RNN, BigScience Workshop, Webhooks Explorers (BETA), 2023 Jan Offsite hackathon, Huggingface Projects, Language Tools, SB Labs, Enterprise Explorers, ZeroGPU Explorers, Social Post Explorers, Dev Mode Explorers, Test, Test Org AWS, Mt Metrics

sbrandeis's activity

reacted to julien-c's post with 👍 15 days ago
After some heated discussion 🔥, we clarify our intent re. storage limits on the Hub

TL;DR:
- public storage is free and, barring blatant abuse, unlimited. We do ask that you consider upgrading to PRO and/or Enterprise Hub if possible
- private storage is paid above a significant free tier (1TB if you have a paid account, 100GB otherwise)
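
As a rough illustration of the private-storage rule above, here is a minimal sketch that computes how much private storage falls outside the free tier. The function name and the decimal conversion (1 TB = 1000 GB) are assumptions made for this example; the post gives no pricing, so none is modeled here.

```python
# Free-tier sizes taken from the TL;DR above.
# Assumption: 1 TB is treated as 1000 GB (decimal units).
GB_PER_TB = 1000
FREE_TIER_GB_PAID_ACCOUNT = 1 * GB_PER_TB
FREE_TIER_GB_FREE_ACCOUNT = 100


def private_storage_over_free_tier_gb(used_gb: float, has_paid_account: bool) -> float:
    """Return the private storage (in GB) that exceeds the free tier.

    Hypothetical helper for illustration only; it is not part of any
    Hugging Face library.
    """
    if used_gb < 0:
        raise ValueError("used_gb must be non-negative")
    free_tier_gb = (
        FREE_TIER_GB_PAID_ACCOUNT if has_paid_account else FREE_TIER_GB_FREE_ACCOUNT
    )
    # Usage at or below the free tier is not billable.
    return max(0.0, used_gb - free_tier_gb)
```

For example, 150 GB of private storage on an account without a paid plan would leave 50 GB above the free tier, while the same usage on a paid account would stay within it.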

docs: https://huggingface.co/docs/hub/storage-limits

We optimize our infrastructure continuously to scale our storage for the coming years of growth in Machine learning, to the benefit of the community 🔥

cc: @reach-vb @pierric @victor and the HF team
New activity in huggingface/documentation-images 4 months ago
New activity in dev-mode-explorers/README 5 months ago
New activity in meta-llama/Llama-3.1-8B-Instruct 5 months ago

Add `base_model` metadata

#72 opened 5 months ago by
sbrandeis
New activity in dev-mode-explorers/README 6 months ago

LL-MWA-4o

1
#8 opened 7 months ago by
SitonmyFACEBOOK

Upload app.py

1
#11 opened 6 months ago by
waqashayder
upvoted an article 6 months ago
reacted to louisbrulenaudet's post with ❤️ 6 months ago
I am delighted to announce the publication of my LegalKit, a French labeled dataset built for legal ML training 🤗

This dataset comprises multiple query-document pairs (+50k) curated for training sentence embedding models within the domain of French law.

The labeling process follows a systematic approach to ensure consistency and relevance:
- Initial Query Generation: Three instances of the LLaMA-3-70B model independently generate three different queries based on the same document.
- Selection of Optimal Query: A fourth instance of the LLaMA-3-70B model, using a dedicated selection prompt, evaluates the generated queries and selects the most suitable one.
- Final Label Assignment: The chosen query is used to label the document, aiming to ensure that the label accurately reflects the content and context of the original text.
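
The three labeling steps above can be sketched as a small generate-then-select routine. This is only an illustration under stated assumptions, not the author's actual pipeline: the LLaMA-3-70B instances are represented by plain callables, and every name here (`label_document`, `Generator`, `Selector`) is hypothetical.

```python
from typing import Callable, Dict, Sequence

# Hypothetical stand-ins for model instances:
# a Generator maps a document to one candidate query;
# a Selector returns the index of the best candidate for that document.
Generator = Callable[[str], str]
Selector = Callable[[str, Sequence[str]], int]


def label_document(
    document: str,
    generators: Sequence[Generator],
    selector: Selector,
) -> Dict[str, str]:
    """Build one query-document pair following the three steps above."""
    if not generators:
        raise ValueError("at least one generator is required")

    # Step 1: each generator independently proposes a query for the document.
    candidates = [generate(document) for generate in generators]

    # Step 2: a separate selector picks the most suitable candidate.
    best_index = selector(document, candidates)
    if not 0 <= best_index < len(candidates):
        raise ValueError("selector returned an out-of-range index")

    # Step 3: the chosen query becomes the label for the document.
    return {"query": candidates[best_index], "document": document}
```

In practice each generator and the selector would wrap a call to a language model with its own prompt; keeping them as injected callables makes the flow easy to test with simple stubs.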

Dataset: louisbrulenaudet/legalkit

Stay tuned for further updates and release information 🔥

@clem, if we can create an "HF for Legal" organization, similar to what exists for journalists, I am available!

Note: My special thanks to @alvdansen for their illustration models ❤️
reacted to Xenova's post with 🔥 6 months ago
Florence-2, the new vision foundation model by Microsoft, can now run 100% locally in your browser on WebGPU, thanks to Transformers.js! 🤗🤯

It supports tasks like image captioning, optical character recognition, object detection, and many more! 😍 WOW!
- Demo: Xenova/florence2-webgpu
- Models: https://huggingface.co/models?library=transformers.js&other=florence2
- Source code: https://github.com/xenova/transformers.js/tree/v3/examples/florence2-webgpu