It goes crazy when greeted
With a proper prompt like "Define Supervised, Unsupervised and Reinforcement learning with a suitable example of each." this produces much better results than the 135M model (which produced gibberish). However, the results of this prompt, as well as many other prompts, are much worse than those produced by Qwen2-0.5B. Also, Qwen2 is multilingual while this is English only. For on-device inference, I will surely go with Qwen2-0.5B as of today.
Yeah, this model is not usable even considering the size. What happened here?
It is evident that this model is not usable yet.
Hi, we just updated the Instruct Models and the outputs should be better, give it a try!
We also have an instant single turn demo here: https://huggingface.co/spaces/HuggingFaceTB/instant-smollm
Outputs are much better. It has a little trouble following multi-step instructions, but it is a tiny model after all.
It can also reach 170 t/s prompt processing and 21 t/s generation on a Pixel 7a through llama.cpp (360M)
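For context, a minimal sketch of the kind of llama.cpp command such on-device runs use. The GGUF filename and flag values here are assumptions, not the poster's exact setup:

```shell
# Representative llama.cpp run of the 360M instruct model.
# The GGUF filename below is an assumption; use whatever quantized
# GGUF of SmolLM-360M-Instruct you have downloaded.
# -n caps the number of generated tokens, -t sets CPU threads.
./llama-cli -m smollm-360m-instruct-q8_0.gguf \
    -p "Define supervised learning in one sentence." \
    -n 128 -t 4
```

llama.cpp prints prompt-processing and generation speeds (t/s) in its timing summary after the run, which is where numbers like the ones above come from.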
For some reason it's way faster on my older phone, reaching 36 t/s generation through llama.cpp (360M)
And it manages to respond better to some questions than Qwen2-0.5B, which thinks the best way to view a rainbow is with a magnifying glass?
SmolLM is also quite confident in its answers compared to Qwen