anakin87
posted an update Aug 29
💬 🇮🇹 Phi 3.5 mini ITA: a Small Language Model for Italian

Lately, I've spent some time fine-tuning language models.

Now I am happy to release Phi 3.5 mini ITA: a fine-tuned version of Phi-3.5-mini-instruct with improved performance on the Italian language.

🔹 Small (3.82 B parameters) but capable model
🔹 128k context length

Chat with it on 🤗 Spaces: anakin87/Phi-3.5-mini-ITA
Model card: anakin87/Phi-3.5-mini-ITA

๐Ÿ—ƒ๏ธ Data
Supervised fine-tuning using a good mix of English and Italian data:
- mlabonne/FineTome-100k by @mlabonne
- efederici/capybara-claude-15k-ita by @efederici
๐Ÿ™ Thanks to the authors for the datasets.


🎯 Targeted training with Spectrum
I used Spectrum, a relatively new technique for parameter-efficient learning.
The idea is to train only the layers of the model with high Signal-to-Noise Ratio (SNR) and ❄️ freeze the rest.
I trained the top 30% of model layers, ranked by SNR.

๐Ÿ“ Spectrum paper: https://arxiv.org/abs/2406.06623


📊 Vibe check and performance on Italian benchmarks seem encouraging

Nice work, congrats!