Appreciate the model drop!
But why is it only 4k? It's 2024, man, those are rookie numbers.
Hi @Nitral-AI, thanks for your interest in these new Granite models. You're right that 4k is a short context window by today's standards. Long context is coming soon (expected to be ready by the end of the year).
Training longer context into the models is sequential, so in order to train longer context lengths, we first needed to train for shorter lengths. These 3.0 short-context models are useful for a number of use cases. We opted to release them as-is with 4k context so that users can start with them now if they fit the use case. For long context (128k), keep an eye out for the upcoming 3.1 drop!
@Nurb4000 Yes! These models are already trained with code-centric capabilities, but due to their short context windows we don't recommend moving off of granite-code (yet!)