Appreciation & Inquiry to StentorLabs

#1
by GODELEV - opened

To: Kai Izumoto (@StentorLabs)

Dear Kai,

I have been following your work at StentorLabs and am deeply impressed by your ability to train strong, efficient base models like Stentor3 entirely on free-tier Kaggle compute. Making full use of TPU quotas and T4 GPUs to build competitive models on a zero-dollar budget is a massive inspiration to the open-source community.

Your dedication proves that impactful AI development doesn't require a massive corporate budget.

I would love to briefly ask: what are your future plans for StentorLabs? Are you planning to refine these hyper-efficient sub-100M architectures further, or are there new training experiments you are looking forward to?

Thank you for your incredible work, transparency, and contribution to open-source AI!

Warm regards,
Akshit

Dear Akshit,

Thank you for the kind words and support. My goal with StentorLabs is to show that capable open-source language models can be built with extremely limited resources. Over the next few years, I plan to continue developing both the Stentor family (primarily in the 10M–99M parameter range) and the Portimbria family (100M+), with a strong focus on improving efficiency rather than simply increasing parameter count.

One of the biggest changes in my thinking recently is that I have largely reversed my previous position on model architecture. Earlier generations leaned toward more balanced or slightly wider designs, but after studying recent small-model research and comparing some of the strongest models in the space, I have become convinced that depth is far more important than I originally thought. As a result, future generations will move toward significantly deeper architectures. I am also investigating hybrid state-space architectures and related approaches that could make long-context training much more compute-efficient.

Looking further ahead, I hope to substantially close the performance gap between very small models and larger alternatives. While my current models are still a work in progress, my ambition is to eventually build models in the 50M-parameter range that can compete with models several times larger. I intend to keep my work open through detailed documentation, model cards, public datasets, and open weights, while maintaining my own training infrastructure and codebase. Although I do not currently publish my training code, I try to document enough of the design decisions and methodology that others can learn from the work and build upon the ideas themselves. More than anything, I hope StentorLabs can contribute useful ideas to the open-source small-language-model community and help demonstrate what independent researchers can accomplish with creativity, persistence, and efficient use of compute.

Warm regards,

Kai Izumoto
Founder, StentorLabs
