Generation time for 128-latest

#13
by Kurapika993 - opened

The generation time for the 128-latest branch is too long. Do you have any idea why? Also, do you have data on the accuracy trade-off between these branches?

No, I'm not sure why 128-latest would be slower.

With regard to accuracy, I will have these figures soon. I am running a comprehensive series of perplexity benchmarks covering every permutation of parameters, using the new AutoGPTQ repo, which will be the future of GPTQ.

I plan to post the results as an issue in the AutoGPTQ repo on GitHub within the next 24-48 hours, so check for it there.
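
For context on the metric: perplexity is the exponential of the average negative log-likelihood per token, so lower is better. Here is a minimal sketch, assuming you already have per-token natural-log probabilities from the model. The function name `perplexity` and its input format are illustrative only and are not part of AutoGPTQ.

```python
import math


def perplexity(token_logprobs):
    """Return exp(mean negative log-likelihood) over the given tokens.

    token_logprobs: natural-log probabilities the model assigned to each
    reference token (hypothetical input; obtain these however your
    evaluation harness exposes them).
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


if __name__ == "__main__":
    # A model that assigns probability 0.25 to every token is as uncertain
    # as a uniform guess over four options.
    print(perplexity([math.log(0.25)] * 4))
```

Running the same evaluation text through two quantised branches and comparing the resulting values gives a rough view of the accuracy trade-off, with a perplexity of roughly 4 in the toy case above.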

Kurapika993 changed discussion status to closed
