Maximum input+output tokens ??
#1
by ha1772007 - opened
CoolSpring/Qwen2-0.5B-Abyme was trained with a sequence_len of 4096, while Qwen/Qwen2-0.5B-Instruct has a 32768-token context length, as the Qwen team claimed in their release blog post based on the Needle in a Haystack task. So I would guess a number in between, leaning towards the low side.
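In practice, the question of "maximum input+output tokens" comes down to a shared budget: whatever the effective context length is, the prompt and the generated tokens have to fit inside it together. A minimal sketch of that arithmetic, using the 4096 training length mentioned above as the conservative budget (the helper name is hypothetical, just for illustration):

```python
# Hypothetical helper: input and output tokens share one context budget.
# seq_len=4096 matches the fine-tune's training sequence length noted above;
# a larger value could be tried, but quality past 4096 is not guaranteed.
def max_output_tokens(input_tokens: int, seq_len: int = 4096) -> int:
    """Return the remaining token budget for generation after the prompt."""
    return max(seq_len - input_tokens, 0)

print(max_output_tokens(1000))  # 3096 tokens left for generation
print(max_output_tokens(5000))  # 0 — prompt already exceeds the budget
```

So with a 1000-token prompt you would have roughly 3096 tokens available for the reply under the conservative assumption.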
However, it is still a guess, and I personally haven't used this model, since it was made for experimental purposes. I'm happy to see you are interested in the model I created; please take care!