Need to set scale to 0.25 in `config.json`?

#1
by jahhs0n - opened

I notice that `max_position_embeddings` in `config.json` has already been set to 8192, but there is no `scale` variable in `config.json`, so it will default to 1 as per the `modelling_llama.py` in this repo. Do we need to add a `scale` variable to `config.json` and set it to 0.25 for the model to work at 8K context?

Yeah, I defaulted `max_position_embeddings` to 8192, and the code that's run by `trust_remote_code=True` will then set the scale to 4 automatically. If you edited max positions to 4K, it would use 2 instead.
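A minimal sketch of that derivation, assuming the usual SuperHOT-style linear interpolation where the base pretraining context is 2048 tokens (the function name and constant here are illustrative, not the repo's actual identifiers):

```python
# Assumed base context length of the original Llama pretraining (2048).
ORIGINAL_CONTEXT = 2048

def derive_scale(max_position_embeddings: int) -> float:
    """Linear RoPE interpolation factor: extended context / base context."""
    return max_position_embeddings / ORIGINAL_CONTEXT

# 8192 in config.json -> scale 4; editing it to 4096 would give scale 2.
scale_8k = derive_scale(8192)
scale_4k = derive_scale(4096)
```

Position indices are then divided by this factor before the rotary embeddings are computed, compressing the extended positions back into the range the model was trained on.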

So no, you don't need to set a `scale` param. Just edit `max_position_embeddings` according to what you want, and the customised Llama modelling code will figure it out. Just make sure to set `trust_remote_code=True`.
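In practice the only change is the one number in `config.json`; a hedged sketch of that edit (the excerpt below shows only the relevant key, not the full config):

```python
import json

# Excerpt of config.json as shipped (other keys omitted for brevity).
config_text = '{"max_position_embeddings": 8192}'

cfg = json.loads(config_text)
# Hypothetical edit: target a 4K context instead of 8K. No `scale` key is
# added; the repo's custom modelling code derives it at load time.
cfg["max_position_embeddings"] = 4096

updated = json.dumps(cfg, indent=2)
```

When loading, pass `trust_remote_code=True` to `from_pretrained` so the repo's custom `modelling_llama.py` runs instead of the stock Transformers implementation.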

Sorry, I should have mentioned that in the fp16 READMEs - it is described in my GPTQ READMEs but I didn't put it in the fp16 ones.

No worries, thanks for the clarification! I just saw the code implementation where the scale is calculated from `max_position_embeddings`.

jahhs0n changed discussion status to closed
