Exceptional creative writer
I enjoy testing models of all sizes for creative writing, and although Granite 3.1 was not designed with this in mind, it is frankly one of the best I have seen yet. Results may depend on the particular circumstances, but as an example, I used a prompt that started with: "write a description for a role-playing game in which the player...." Here is a sample excerpt from our game:
"As you regain your composure deep within the cavern's winding corridors, the echo of goblin shouts begins to fade into distant echoes. You find yourself in a vast chamber that stretches farther than your torch can illuminate, its ceiling disappearing into darkness overhead. Stalactites hang menacingly from above like jagged teeth, while stalagmites rise like ancient sentinels guarding the floor below.
The air here is stale and heavy with the scent of damp earth mixed with something else - a faint, sweet aroma that you can't quite place. It's as if the very stone itself exudes an otherworldly perfume.
Turning your attention to the stolen loot spilled onto a flat rock, you marvel at its diversity and worth: A handful of gold coins gleam with an almost liquid luminescence under your torchlight; rubies, sapphires, and emeralds catch the firelight, their facets sparkling like stars trapped within gems. Among these precious stones lies a small, intricately carved wooden box adorned with silver filigree - clearly more than meets the eye."
Really impressive.
In which business context did you use this prompt, @mayank-mishra? Just joking, obviously!
But I'm impressed as well with the quality of the output for such a small model!
I'm a writer by craft, so I'm necessarily sensitive to certain details and characteristics in prose. Plenty of LLMs are extremely proficient in detailed descriptions, and even varied use of vocabulary. This model stands out for its consistent use of metaphors in its descriptions, with no prodding or prompting (pun intended) on my part, and frankly suggests (to my innocent eyes) that the content it was trained on and is modeling was very well-curated. Samples such as:
"Stalactites hang menacingly from above like jagged teeth, while stalagmites rise like ancient sentinels guarding the floor below."
or
"As you cautiously venture into this dimly lit cave, your senses are immediately awakened by the oppressive silence - a stark contrast to the symphony of rustling leaves and chirping insects that greeted you just moments ago."
are definitely atypical. I'm loving it for sure, and wonder what would result from a larger model trained like this. Hint, hint. :-) Regardless, a massive thanks and congrats to the team behind it.
This model has shot way up on my list of models. It is solid and accurate, with no extra stuff I don't need. Enterprise users will be happy with it.
Yeah, a lot of recent LLMs score suspiciously high on tests, but are extremely weak in a lot of areas.
For example, Qwen2.5 7b, EXAONE3.5 7b, Falcon3 7b... all score around 35 on MMLU-Pro but are absurdly ignorant across all popular domains of human knowledge. And although Granite 3.1 7b doesn't match the broad knowledge of Llama 3.1 8b and Gemma2 9b, it's at least in the ballpark and has vastly more world knowledge than the previously listed models.
So I agree, IBM (at least for now) is trying to make a strong general purpose LLM rather than just selectively training for math, coding, and what's covered by the standardized tests.
Totally agree with @phil111 here. You guys should try the new 2b Granite model too, it's also very good.