Alessandro Ercolani

giux78

AI & ML interests

NLP, Reinforcement Learning, Semantics, Computational Neuroscience

giux78's activity

posted an update 23 days ago
🎉 Super @DeepMount00 just released Gemma_QA_ITA_v3, leading the RAG task on the Italian LLM_ITA_LEADERBOARD. The model is a fine-tuned version of Gemma 2B.
Model details: DeepMount00/Gemma_QA_ITA_v3
Explore the full RAG rankings here: FinancialSupport/open_ita_llm_leaderboard, in the "Classifica RAG" section
posted an update about 1 month ago
While evaluating fine-tuned 7B Italian open-source LLMs, I collected many data points and put together a very simple exploratory analysis. My hypotheses, based on the data, are:

- MMLU is hard to improve when fine-tuning a base model on a different language
- fine-tuning, even on a single GPU, can improve the base model by 5% to 10% on common tasks, and by a lot more on specific cases with the right training time and data
- fine-tuning can specialize a model well, but at the cost of losing some foundational knowledge.
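As a rough illustration of how a "5% to 10%" improvement over the base model could be computed per task, here is a minimal sketch. It assumes the improvement is meant as a relative gain over the base score, which the post does not state explicitly. The task names `task_a` and `task_b` and all scores are hypothetical placeholders, not values from the spreadsheet.

```python
def relative_gain(base_score: float, tuned_score: float) -> float:
    """Relative improvement of the fine-tuned score over the base score, in percent."""
    if base_score <= 0:
        raise ValueError("base_score must be positive")
    return (tuned_score - base_score) / base_score * 100.0


# Hypothetical placeholder scores as (base, fine-tuned) pairs.
# These are NOT taken from the linked spreadsheet.
scores = {
    "mmlu": (0.50, 0.50),
    "task_a": (0.60, 0.63),
    "task_b": (0.40, 0.44),
}

# Relative gain per task, rounded to one decimal place.
gains = {
    task: round(relative_gain(base, tuned), 1)
    for task, (base, tuned) in scores.items()
}

for task, gain in gains.items():
    print(f"{task}: {gain:+.1f}%")
```

A negative value from `relative_gain` would correspond to the third hypothesis, where a fine-tuned model loses ground on a task the base model handled better.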

Here is the data: https://docs.google.com/spreadsheets/d/1MBcxy1loK8eIycZG4DN84Q2ejZ0jSjxUBgoShHDR6IY/edit?usp=sharing
Here is the Colab: https://colab.research.google.com/drive/1ra4_skG5QYWSYOzvagOoIoj4bibQD8Gw?usp=sharing
Here is an article with some considerations: https://medium.com/@giuxale/an-analyses-on-italian-llms-models-evaluations-51bffe1d44d1

replied to osanseviero's post about 1 month ago
posted an update about 1 month ago
Based on the work of @mrinaldi and @ruggsea, we just released the largest ready-for-training conversational dataset based on Italian-language Usenet data 🇮🇹. It contains about 9 million conversations between real humans.

mii-community/UsenetArchiveIT-conversations
replied to their post about 1 month ago

@clefourrier sometimes rendering the data takes a long time, as in the attached picture, even though the data is in a static file inside the app. Is there a way to improve this? FYI, we are on the free tier.

Screenshot 2024-03-26 at 12.59.10.png

replied to their post about 1 month ago
posted an update about 1 month ago
posted an update about 2 months ago
Wonderful open-source Italian dataset from @manalog and @ruggsea:

https://huggingface.co/datasets/manalog/UsenetArchiveIT

The dataset contributes to the https://huggingface.co/mii-community project, which aims to advance the creation of Italian open-source Large Language Models (LLMs). 🇮🇹 🤖 At about 10-20 billion tokens, it is probably the best open-source conversational dataset in the Italian language. 🇮🇹
replied to their post about 2 months ago
posted an update about 2 months ago
Super work from @DeepMount00:

🚀 Discover Universal Ner: A GliNer-Based Italian NER

Introducing Universal Ner for Italian Language, a revolutionary Named Entity Recognition (NER) model evolved from the GliNer architecture and meticulously tailored for the Italian language. This advanced model is a beacon of efficiency and versatility, engineered to recognize any entity type within the rich nuances of Italian, using a bidirectional transformer encoder. It stands out as an ideal solution for those navigating the challenges of resource-limited environments or seeking an efficient alternative to the cumbersome Large Language Models (LLMs).
Runs fast also on CPU!

Experience this Italian-focused innovation live on Hugging Face Spaces:
DeepMount00/universal_ner_ita

Paper: https://arxiv.org/abs/2311.08526 by Urchade Zaratiana et al. Great work!
posted an update about 2 months ago