Has anybody measured the impact on effective context size (cf. RULER)? IMHO this model could have a lot of potential for RAG fine-tuning if the effective context size is not reduced too much.
Best Regards