Review
I have tested many models over the past three years, from the old BERT models and GPT-2, through RWKV World, to the TinyLlama series, Qwen 1.5, and finally the recent Qwen 2 and Llama 3.2... and the closest to perfection was that last one, but still not perfect.
However, my perspective changed completely when I found THIS Llama 3.2 version, which I can say is THE BEST Llama 3.2 1.24B model I have found so far, and not only within the Llama 3.2 series but also compared with the other models mentioned. I have been pleasantly surprised by its quality and coherence; its flaws are almost imperceptible, even in Spanish.
I want to send my congratulations and a cordial thank-you to the creator, who has achieved, by far, the best AI to run locally on a cell phone with 3 GB of RAM.
Note: I don't know what the ideal inference settings are for this model, but I have found the configuration that works best for me, which is the following:
Temp: 0.7 | Rep. Pen: 1.02 | Top-P: 0.95 | Top-K: 40
Top-A: 0.96 | Typical: 0.6 | TFS: 0.85 | Min-P: 0.01
Pr. Pen: 1.5 | Seed: -1
Rep. Pen. Range: 389 | Rep. Pen. Slope: 2.5 | Smoothing Factor: 0.7
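For anyone who wants to try roughly these settings outside a mobile front-end, here is a minimal sketch using llama-cpp-python with a local GGUF quantization of the model; the file name and prompt are hypothetical. Top-A, the repetition-penalty range/slope, and the smoothing factor are KoboldCpp-style samplers with no direct equivalent in this API, so they are omitted.

```python
# Minimal sketch: applying the reviewer's sampler settings with llama-cpp-python.
# Assumes a local GGUF quantization of the model; the file name is hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="llama-3.2-1b-multilingual.Q4_K_M.gguf",  # hypothetical file name
    n_ctx=2048,
    seed=-1,  # -1 = random seed, matching the reviewer's setting
)

out = llm.create_completion(
    prompt="Escribe un haiku sobre la lluvia.",  # hypothetical prompt
    max_tokens=128,
    temperature=0.7,
    top_p=0.95,
    top_k=40,
    min_p=0.01,
    typical_p=0.6,
    tfs_z=0.85,            # tail-free sampling ("TFS")
    repeat_penalty=1.02,
    presence_penalty=1.5,  # assuming "Pr. Pen" means presence penalty
    # Top-A, Rep. Pen. Range/Slope and Smoothing Factor are front-end-specific
    # samplers with no direct llama-cpp-python equivalent, so they are omitted.
)
print(out["choices"][0]["text"])
```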
Thanks a million for your review, @Novaciano
This model was trained entirely using the LM-Kit Fine-tuning API, which embeds multilingual datasets.
The model produces very decent accuracy with pretty much any inference system, though it has been optimized for the LM-Kit inference system, which provides up to 10× faster inference while producing even better accuracy.
"Note: I don't know what the ideal inference settings are for this model"
I would favor greedy sampling, as this is a simple classification task, and therefore set the temperature to 0.
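As a follow-up to the sketch above, greedy decoding can be approximated by pinning the sampler; the prompt is again hypothetical:

```python
# Greedy decoding sketch: temperature 0 makes llama.cpp pick the most likely token.
out = llm.create_completion(
    prompt="Classify the sentiment of: 'Me encanta este modelo.'",  # hypothetical prompt
    max_tokens=8,
    temperature=0.0,  # 0 disables random sampling (greedy)
    top_k=1,          # equivalently, keep only the single best token
)
print(out["choices"][0]["text"])
```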
All the best!
Loïc