Question about the quantization parameters of this model

#3
by pytokusu - opened

Hi, Neody!

I have a question for you about the quantization parameters of this model. Did you use some alternative calibration for this model, or anything else non-standard?

I have provisionally taken 4th place with this model in the Kaggle AIMO-2 math competition. There were 2,355 teams in the competition, ranging from one to eight people each. I did not have a team. The competition lasted 5 months.

I am writing an article about my program and how I arrived at this solution. In my program, in order to fit within the allotted time of 50 problems in 5 hours on the L4x4 machine, I stopped after only 2 generated answers if they matched. This way I was able to stay within the time limit and solve 29 problems out of 50.

I did the quantization of this model myself, but my best result was 28 solved problems. Could you tell me what you did differently that made your model work better than the model from Casper Hansen?

Thanks in advance
With respect, an admirer of your models.
If this is your know-how and a big secret, just say so; I will not be offended.

neodyland org

Unfortunately, for quantization I simply followed this script, https://github.com/casper-hansen/AutoAWQ/blob/main/examples/quantize.py, which is nothing special.
Maybe the difference is this tokenizer config setting:

"add_bos_token": false,

plus some bfloat16 weights, though I don't remember exactly how I handled those.
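For reference, the linked AutoAWQ example boils down to roughly the following (a sketch, not the exact script that was run; `model_path` and `quant_path` are placeholders, and the `quant_config` values are the defaults from the AutoAWQ example):

```python
# Typical AWQ settings from the AutoAWQ example script.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

def quantize(model_path: str, quant_path: str) -> None:
    # Heavy imports are kept inside the function so the sketch can be
    # read (and the config inspected) without AutoAWQ installed.
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    # Load the full-precision model and its tokenizer.
    model = AutoAWQForCausalLM.from_pretrained(model_path)
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

    # Quantize with the default calibration set, then save the result.
    model.quantize(tokenizer, quant_config=quant_config)
    model.save_quantized(quant_path)
    tokenizer.save_pretrained(quant_path)
```

With no custom calibration data passed to `model.quantize`, the script uses AutoAWQ's default calibration corpus, which is consistent with "nothing special" above.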

If you have more questions, feel free to ask! I'm happy that this model helped someone in a Kaggle competition!

Neody, thanks for the reply! I used the same script too. It turns out the only real difference so far is "add_bos_token": false. Anyway, thank you very much for your reply! Keep making such great models! Good luck to you!
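For anyone wanting to reproduce that one difference, the setting can be applied to a local copy of the model with a few lines of standard-library Python (a sketch; `tokenizer_config.json` is the usual Hugging Face filename, and the path is a placeholder):

```python
import json

def disable_bos(config_path: str) -> dict:
    # Load the tokenizer config, switch off the automatic BOS token,
    # and write the file back in place.
    with open(config_path) as f:
        config = json.load(f)
    config["add_bos_token"] = False
    with open(config_path, "w") as f:
        json.dump(config, f, indent=2)
    return config
```

Calling `disable_bos("my-model/tokenizer_config.json")` before quantizing (or before uploading the quantized model) should reproduce the tokenizer side of the setup described above.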

pytokusu changed discussion status to closed