All HF Hub posts

MonsterMMORPG posted an update about 1 hour ago
IDM-VTON (Improving Diffusion Models for Authentic Virtual Try-on in the Wild) is so powerful that it can even transfer a beard or hair.

I have prepared installer scripts and full tutorials for Windows (requires a GPU with at least 8 GB of VRAM), Massed Compute (which I suggest if you don't have a strong GPU), RunPod, and a free Kaggle account (which also works perfectly, just slowly).

Windows Tutorial : https://youtu.be/m4pcIeAVQD0

Cloud (Massed Compute, RunPod & Kaggle) Tutorial : https://youtu.be/LeHfgq_lAXU

qq8933 posted an update about 4 hours ago
ChemLLM-20B SFT and DPO are coming! 🤗
fdaudens posted an update about 7 hours ago
A new dataset for anyone interested in satellite imagery: 3 million @Satellogic images of unique locations (6 million images in total, including location revisits) from around the world, under a Creative Commons CC-BY 4.0 license.

Interesting potential in journalism.

satellogic/EarthView
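A minimal sketch for peeking at the data with the datasets library (the config and split names are assumptions; check the dataset card first):

```python
from datasets import get_dataset_config_names, load_dataset

# Config and split names are assumptions; check the dataset card first.
configs = get_dataset_config_names("satellogic/EarthView")
print(configs)

# Stream to avoid downloading millions of images up front.
ds = load_dataset("satellogic/EarthView", configs[0], streaming=True, split="train")
print(next(iter(ds)).keys())
```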
georgewritescode posted an update about 7 hours ago
Excited to bring our benchmarking leaderboard of >100 LLM API endpoints to HF!

Speed and price are often just as important as quality when building applications with LLMs. We bring together all the data you need to weigh all three when picking a model and API provider; a sketch of how the latency and throughput numbers can be measured follows the coverage list below.

Coverage:
‣ Quality (Index of evals, MMLU, Chatbot Arena, HumanEval, MT-Bench)
‣ Throughput (tokens/s: median, P5, P25, P75, P95)
‣ Latency (TTFT, time to first token: median, P5, P25, P75, P95)
‣ Context window
‣ OpenAI library compatibility
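As a rough illustration of how the TTFT and throughput numbers above can be measured, here is a minimal sketch against any OpenAI-compatible endpoint (the base URL, model name, and prompt are placeholders, and this is not the leaderboard's actual methodology):

```python
import time
from openai import OpenAI

# Placeholder endpoint and model; any OpenAI-compatible provider works.
client = OpenAI(base_url="https://example-provider/v1", api_key="...")

start = time.perf_counter()
first_token = None
chunks = 0
stream = client.chat.completions.create(
    model="some-model",
    messages=[{"role": "user", "content": "Write a haiku about benchmarks."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        if first_token is None:
            first_token = time.perf_counter()  # time of first token (for TTFT)
        chunks += 1  # stream chunks as a rough proxy for tokens
end = time.perf_counter()

print(f"TTFT: {first_token - start:.2f}s")
print(f"Throughput: {chunks / (end - first_token):.1f} tokens/s (approx.)")
```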

Link to Space: ArtificialAnalysis/LLM-Performance-Leaderboard

Blog post: https://huggingface.co/blog/leaderboard-artificial-analysis
davanstrien posted an update about 10 hours ago
Only 14 languages have DPO preference-style datasets on the Hugging Face Hub (DIBT/preference_data_by_language). Let's improve that! How?

The Cohere For AI Aya dataset CohereForAI/aya_dataset has human-annotated prompt-completion pairs in 71 languages. We can use this to create DPO datasets for more languages!

Using Aya's prompt/response pairs as a starting point, we can use an LLM to generate an additional response to each prompt. We then use an LLM judge to rank the responses (a rough sketch of this generate-then-judge step is below).

✅ In some (perhaps many) languages, human responses may be better than LLM-generated ones, but we want to check that assumption for each language.
🚀 We use Argilla's distilabel library to generate the data and push it to Argilla for validation. This also lets us determine whether an LLM judge is effective for different languages.
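As a minimal sketch of the generate step (the real pipeline is built with distilabel; the model ID, its availability on the Inference API, and the column handling are assumptions):

```python
from datasets import load_dataset
from huggingface_hub import InferenceClient

# Illustrative only: the actual pipeline uses Argilla's distilabel library.
# The model ID is an assumption and may not be served on the Inference API.
generator = InferenceClient("meta-llama/Meta-Llama-3-70B-Instruct")

aya = load_dataset("CohereForAI/aya_dataset", split="train")
dutch = aya.filter(lambda row: row["language"] == "Dutch")

def add_generated_response(row):
    # Generate a second candidate response for the human-written prompt.
    generated = generator.chat_completion(
        messages=[{"role": "user", "content": row["inputs"]}],
        max_tokens=512,
    ).choices[0].message.content
    # An LLM judge would then rank the human and generated responses;
    # the winner becomes "chosen" and the loser "rejected" for DPO.
    return {"prompt": row["inputs"], "human": row["targets"], "generated": generated}

candidates = dutch.select(range(10)).map(add_generated_response)
```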

As an example of what this pipeline produces:
- DIBT/aya_dutch_dpo: a DPO-style dataset for Dutch, built using Llama 3 as the generator and judge LM.
- An annotation Space that anyone with a HF account can contribute to: https://dibt-demo-argilla-space.hf.space/dataset/924ef8a8-a447-4563-8806-0e2a668a5314/annotation-mode?page=1&status=pending

As part of Data is Better Together we want to build more DPO datasets. Join us here: https://github.com/huggingface/data-is-better-together#4-dpoorpo-datasets-for-more-languages 🤗
abhishek posted an update about 11 hours ago
🚀🚀🚀🚀 Introducing AutoTrain Configs! 🚀🚀🚀🚀
Now you can train models using YAML config files! 💥 These configs are easy to understand and not at all overwhelming, so even a person with almost zero knowledge of machine learning can train state-of-the-art models without writing any code. Check out the example configs in the config directory of the autotrain-advanced GitHub repo (a hypothetical sketch of the format is below), and feel free to share your own configs by creating a pull request 🤗
GitHub repo: https://github.com/huggingface/autotrain-advanced
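For a taste of the format, a hypothetical sketch of what a config might look like (the field names here are illustrative, not authoritative; use the real examples in the config directory as your reference):

```yaml
# Hypothetical sketch of an AutoTrain config; field names are illustrative.
# See the config directory in autotrain-advanced for real, working examples.
task: llm-sft
base_model: meta-llama/Meta-Llama-3-8B
project_name: my-first-finetune
data:
  path: HuggingFaceH4/no_robots
  train_split: train
params:
  epochs: 3
  batch_size: 2
  lr: 2e-5
hub:
  push_to_hub: true
```

You would then point the CLI at the file, e.g. `autotrain --config my-config.yml` (assumed invocation; check the repo's README).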
phenixrhyder posted an update about 12 hours ago
May the Fourth be with you... AI Yoda #StarWarsDay
gsarti posted an update about 15 hours ago
🔍 Today's (self-serving) pick in Interpretability & Analysis of LMs:

A Primer on the Inner Workings of Transformer-based Language Models
by @javifer @gsarti @arianna-bis and M. R. Costa-jussà
( @mt-upc , @GroNLP , @facebook )

This primer offers a comprehensive introduction to recent advances in interpretability for Transformer-based LMs, aimed at a technical audience. It uses a unified notation to introduce network modules and to present state-of-the-art interpretability methods.

Interpretability methods are presented with detailed formulations and categorized by whether they localize the inputs or model components responsible for a particular prediction, or decode information stored in learned representations. Various insights on the roles of specific model components are then summarized, alongside recent work that uses model internals to guide editing and to mitigate hallucinations.

Finally, the paper provides a detailed picture of the open-source interpretability tools landscape, supporting the need for open-access models to advance interpretability research.

📄 Paper: A Primer on the Inner Workings of Transformer-based Language Models (2405.00208)

🔍 All daily picks: https://huggingface.co/collections/gsarti/daily-picks-in-interpretability-and-analysis-ofc-lms-65ae3339949c5675d25de2f9
Taylor658 posted an update about 20 hours ago
fdaudens posted an update about 22 hours ago
I've added new collections to the Journalists on 🤗 community, focusing on Data Visualization, Optical Character Recognition, and Multimodal Models:

- TinyChart-3B: This model interprets data visualizations based on your prompts. It can generate the underlying data table from a chart or recreate the chart with Python code.
- PDF to OCR: Convert your PDFs to text—ideal for FOI records sent as images.
- Idefics-8b: A multimodal model that allows you to ask questions about images.

Explore these tools here: 👉 https://huggingface.co/JournalistsonHF