higher context with alpha_value=2.5

#1
by mclassHF2023 - opened

I tried this with alpha_value=2.5 (at 5_0 bit) and it seems to work surprisingly well at 16k context! So far my favorite Llama3 model!
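
For anyone wondering what that setting does numerically, here is a rough sketch. It assumes `alpha_value` is the NTK-aware RoPE scaling factor as ExLlama-style loaders apply it (base multiplied by alpha raised to head_dim / (head_dim - 2)). It also assumes the usual Llama 3 8B values of rope_theta = 500000 and head_dim = 128. The function name is made up for illustration, and none of these numbers come from this model's config, so check them against your loader.

```python
def ntk_scaled_rope_base(base: float, alpha: float, head_dim: int) -> float:
    """Return the RoPE frequency base after NTK-aware alpha scaling.

    Hypothetical helper: mirrors the commonly used formula
    base * alpha ** (head_dim / (head_dim - 2)).
    """
    return base * alpha ** (head_dim / (head_dim - 2))


# Assumed Llama 3 defaults (not taken from this thread).
LLAMA3_ROPE_THETA = 500_000.0
LLAMA3_HEAD_DIM = 128

scaled = ntk_scaled_rope_base(LLAMA3_ROPE_THETA, 2.5, LLAMA3_HEAD_DIM)
print(f"effective rope base with alpha_value=2.5: {scaled:,.0f}")
```

With alpha = 1.0 the base is unchanged, so the setting only matters once you go past the native context length.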
