PrunaAI/Locutusque-TinyMistral-248M-bnb-8bit-smashed · Seeking information about smashing

Apr 18, 2024

Hello, to be frank I am quite impressed by your work. Can you please point me to resources to understand smashing in detail?

sharpenb

Pruna AI org Apr 19, 2024

I am happy that you like what we do :) In the context of Pruna, smashing refers to applying any compression method, or a combination of several, to an ML model. It can include quantization, pruning, compilation, and many other compression methods. On each model page, we provide a smash_config.json that details the parameters used to compress the model. We also try to keep our documentation up to date. E.g., for this model we use llm-int8 quantization from the great bitsandbytes (bnb) library to compress the model.
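To give a rough feel for what 8-bit quantization does to a weight matrix, here is a minimal pure-Python sketch of row-wise absmax quantization. This is only an illustration under simplifying assumptions: it is not the bitsandbytes implementation, the function names are made up for this example, and the actual llm-int8 method additionally keeps outlier features in higher precision.

```python
def quantize_row_absmax(row):
    """Map one row of floats to int8 codes in [-127, 127] plus a float scale."""
    max_abs = max(abs(x) for x in row)
    if max_abs == 0.0:
        # An all-zero row needs no scaling; any non-zero scale works.
        return [0] * len(row), 1.0
    scale = max_abs / 127.0
    # Round to the nearest integer code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(x / scale))) for x in row]
    return codes, scale


def dequantize_row(codes, scale):
    """Recover approximate float values from int8 codes and their scale."""
    return [c * scale for c in codes]


# Hypothetical toy weights, chosen only for illustration.
weights = [[0.5, -1.0, 0.25], [0.02, 0.0, -0.01]]

quantized = [quantize_row_absmax(row) for row in weights]
restored = [dequantize_row(codes, scale) for codes, scale in quantized]

for original, (codes, scale), approx in zip(weights, quantized, restored):
    print(original, "->", codes, "scale:", scale, "->", approx)
```

Each row stores one float scale plus one byte per weight, so the memory cost per weight drops to roughly a quarter of float32, at the price of a rounding error of at most half a scale step per value.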

sharpenb changed discussion status to closed Apr 19, 2024

owao

Apr 25, 2024

Sorry wrong thread