Default ExllamaV2 Dataset Alternative

#1
by AliCat2 - opened

Hello!

Do you have any plans to release future models quantized with the default ExllamaV2 dataset instead of rpcal? In tests comparing the same model at the same bpw, the rpcal quants performed consistently worse on Mixtral models. Of course, there's always some room for subjectivity, but it could be worth checking, since the difference was noticeable.

Maybe not for this specific model, but more something to keep in mind! (This model is a little too censored for my tastes, but someone else may appreciate the alternative quant.)

Hello!

Thanks for the feedback! Can you share the tests you used, so that I can experiment with various quantization datasets and settings? One problem with the default quantization dataset is that it is too short to quantize with the settings that I use (8192 row length), and I suspect that row length is important for making sure that the model performs well at longer contexts. I've had several people comment that my -rpcal models perform better when the context exceeds a certain length. I may experiment with an expanded default dataset that would be long enough and also include some RP data.
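To make the length problem concrete: a calibration run needs roughly (number of rows) x (row length) tokens, so a dataset that is fine at short rows can fall short at 8192. Below is a minimal sketch of that arithmetic. The token count and the row count are made-up numbers for illustration, the function names are hypothetical, and it assumes rows are cut as non-overlapping chunks, which may not match how the quantizer actually samples.

```python
def calibration_rows_available(total_tokens: int, row_length: int) -> int:
    """Count the full, non-overlapping rows of row_length tokens that fit."""
    if row_length <= 0:
        raise ValueError("row_length must be positive")
    return total_tokens // row_length


def is_long_enough(total_tokens: int, rows_needed: int, row_length: int) -> bool:
    """True if the dataset can supply rows_needed rows of row_length tokens."""
    return calibration_rows_available(total_tokens, row_length) >= rows_needed


# Hypothetical numbers, for illustration only (not measured from any real dataset).
dataset_tokens = 500_000
rows_needed = 100

for row_length in (2048, 8192):
    rows = calibration_rows_available(dataset_tokens, row_length)
    ok = is_long_enough(dataset_tokens, rows_needed, row_length)
    print(f"row_length={row_length}: {rows} rows available, enough={ok}")
```

With these example numbers the same dataset covers 100 rows at 2048 tokens but not at 8192, which is the kind of shortfall described above.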

Thanks for the update! I'll use that information to expand the testing appropriately. The tests so far were run with a context size of 4096 on 3.75bpw Mixtral models, so I didn't notice the issues others were referring to. I'll aim the next set of tests at 8192 instead.
