17 10 81

Alessandro Ercolani

giux78

https://alessandroercolani.webflow.io/

AI & ML interests

NLP, Reinforcement Learning, Semantics, Computational Neuroscience

Recent Activity

liked a Space 3 days ago

evalitahf/evalita_llm_leaderboard

reacted to tomaarsen's post with 🔥 8 days ago

‼️Sentence Transformers v4.0 is out! You can now train and finetune reranker models with multi-GPU training, bf16 support, loss logging, callbacks & much more. I also prove that finetuning on your domain helps much more than you might think. 1️⃣ Reranker Training Refactor Reranker models can now be trained using an extensive trainer with a lot of powerful features: - MultiGPU Training (Data Parallelism (DP) and Distributed Data Parallelism (DDP)) - bf16 training support; loss logging - Evaluation datasets + evaluation loss - Improved callback support + an excellent Weights & Biases integration - Gradient checkpointing, gradient accumulation - Model card generation - Resuming from a training checkpoint without performance loss - Hyperparameter Optimization and much more! Read my detailed blogpost to learn about the components that make up this new training approach: https://huggingface.co/blog/train-reranker Notably, the release is fully backwards compatible: all deprecations are soft, meaning that they still work but emit a warning informing you how to upgrade. 2️⃣ New Reranker Losses - 11 new losses: - 2 traditional losses: BinaryCrossEntropy and CrossEntropy - 2 distillation losses: MSE and MarginMSE - 2 in-batch negatives losses: MNRL (a.k.a. InfoNCE) and CMNRL - 5 learning to rank losses: Lambda, p-ListMLE, ListNet, RankNet, ListMLE 3️⃣ New Reranker Documentation - New Training Overview, Loss Overview, API Reference docs - 5 new, 1 refactored training examples docs pages - 13 new, 6 refactored training scripts - Migration guides (2.x -> 3.x, 3.x -> 4.x) 4️⃣ Blogpost Alongside the release, I've written a blogpost where I finetune ModernBERT on a generic question-answer dataset. My finetunes easily outperform all general-purpose reranker models, even models 4x as big. Finetuning on your domain is definitely worth it: https://huggingface.co/blog/train-reranker See the full release notes here: https://github.com/UKPLab/sentence-transformers/releases/v4.0.1

reacted to their post with 👀 10 days ago

This is truly an inspirational story please help us spread the word, @clem , @thomwolf and everyone who supports open source AI. A few weeks ago, @mmuffo94 and @cittiberto from indigo_ai launched the Chatbot Arena for the Italian language: https://indigo.ai/it/chatbot-arena-italia/. To our surprise, among the top-ranked models is https://huggingface.co/mii-llm/maestrale-chat-v0.4-beta a carefully fine-tuned version of https://huggingface.co/mistralai/Mistral-7B-v0.1, developed by @efederici and @mferraretto from https://huggingface.co/mii-llm, and released nearly a year ago. At this very moment, as shown in the screenshot, https://huggingface.co/mii-llm/maestrale-chat-v0.4-beta is ranked 8th right between ChatGPT-4.5 and ChatGPT-4o. It's likely that for several months, the best Italian speaking LLM has been an open source 7B model created by open source contributors and hardly anyone knew it.

View all activity

Organizations

Posts 12

Post

3155

mii-llm , and released nearly a year ago.

At this very moment, as shown in the screenshot, mii-llm/maestrale-chat-v0.4-beta is ranked 8th right between ChatGPT-4.5 and ChatGPT-4o.

It's likely that for several months, the best Italian speaking LLM has been an open source 7B model created by open source contributors and hardly anyone knew it.

View all Posts