Seeking information about smashing

#2
by ewre324 - opened

Hello, to be frank I am quite impressed by your work. Can you please point me to resources to understand smashing in detail?

Pruna AI org

I am happy that you like what we do :) In the context of Pruna, smashing refers to applying any compression method, or a combination of several, to an ML model. It could include quantization, pruning, compilation, or many other compression methods. On each model page, we provide a smash_config.json that details the parameters used to compress the model. We also constantly try to update our documentation here. E.g., for this model we use llm-int8 quantization from the great bitsandbytes (bnb) to compress the model.
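To give a rough feel for what 8-bit quantization does to weights, here is a minimal sketch of plain absmax int8 quantization in pure Python. This is a toy illustration only, not the bitsandbytes implementation: llm-int8 works vector-wise and keeps outlier features in higher precision, which this sketch does not attempt. The function names `quantize_absmax_int8` and `dequantize` and the sample weights are made up for the example.

```python
def quantize_absmax_int8(values):
    """Map floats to integers in [-127, 127] using one absmax scale (toy scheme)."""
    absmax = max(abs(v) for v in values)
    if absmax == 0.0:
        # All-zero input: any scale works, so use 1.0 to avoid dividing by zero.
        return [0] * len(values), 1.0
    scale = absmax / 127.0
    # Round to the nearest integer step and clamp to the int8 range.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integer codes."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_absmax_int8(weights)
restored = dequantize(q, scale)

for original, code, approx in zip(weights, q, restored):
    print(f"{original:+.4f} -> {code:+4d} -> {approx:+.4f}")
```

Each weight is stored as a one-byte integer plus a shared scale, so the memory footprint drops compared with 16- or 32-bit floats, at the cost of a rounding error of at most half a quantization step per value.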

sharpenb changed discussion status to closed

Sorry wrong thread
