About "model_max_length": 16384

#11
by AlexWuKing - opened

The original model_max_length of Qwen/Qwen2.5-7B-Instruct is 131072,
but in the distilled model deepseek-ai/DeepSeek-R1-Distill-Qwen-7B it is set to 16384.

I wonder why this was changed?

Screenshots: Screenshot 2025-03-12 at 2.24.47 AM.png, Screenshot 2025-03-12 at 2.25.04 AM.png
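
For anyone who wants to reproduce the comparison, here is a minimal sketch. It assumes the field lives in each repo's tokenizer_config.json, and it uses inline JSON fragments with the two values quoted above rather than downloading anything; the names `QWEN_CONFIG`, `DISTILL_CONFIG`, and `max_length` are made up for illustration.

```python
import json

# Fragments of tokenizer_config.json, reduced to the one field in question.
# Assumption: the values are the ones quoted in this post, not fetched from the Hub.
QWEN_CONFIG = '{"model_max_length": 131072}'
DISTILL_CONFIG = '{"model_max_length": 16384}'


def max_length(config_text: str) -> int:
    """Return model_max_length from a tokenizer_config.json string."""
    return int(json.loads(config_text)["model_max_length"])


base = max_length(QWEN_CONFIG)
distill = max_length(DISTILL_CONFIG)
ratio = base // distill

print(f"base={base}, distill={distill}, ratio={ratio}x")
```

If transformers is installed, `AutoTokenizer.from_pretrained(<repo id>).model_max_length` should report the same field for each repo, though I have only compared the config files directly.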
