How to train sparse models like these

#1
by mobinx - opened

Guys, how can I convert or fine-tune an existing model to work with PowerInfer? How do I add sparsity to a model the way you did with this one?

PowerInfer org

Well, the HF models are available at 7B and 13B. For more details on training the sparse model, refer to our newly released paper.
If you are curious about how to convert HF models into GGUF format to work with PowerInfer, you should first train activation predictors (see the sample code). Next, follow the instructions at PowerInfer for file conversion.
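
For readers new to the term: an activation predictor is a small model that guesses, from a layer's input, which feed-forward neurons will be nonzero after ReLU. The toy sketch below only illustrates how such 0/1 training labels could be derived. It is not taken from the PowerInfer code or the sample scripts; the function names are hypothetical, the weights are random, and a single linear projection stands in for a real (gated) feed-forward layer.

```python
import random

def relu_activation_mask(x, weight):
    """Return a 0/1 list marking which neurons fire (pre-activation > 0).

    x: list of floats, the layer input (length d_in)
    weight: list of rows, one row of d_in floats per neuron
    """
    mask = []
    for row in weight:
        pre_activation = sum(w * xi for w, xi in zip(row, x))
        mask.append(1 if pre_activation > 0 else 0)
    return mask

def sparsity(mask):
    """Fraction of neurons that are inactive for this input."""
    return 1.0 - sum(mask) / len(mask)

# Toy demo with random weights (an assumption; a real model's weights
# and inputs would come from the ReLU-activated checkpoint and dataset).
rng = random.Random(0)
d_in, n_neurons = 8, 32
weight = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(n_neurons)]
x = [rng.gauss(0, 1) for _ in range(d_in)]

mask = relu_activation_mask(x, weight)
print(f"active neurons: {sum(mask)}/{n_neurons}, sparsity: {sparsity(mask):.2f}")
```

A predictor would then be trained to map `x` to `mask`, so that at inference time only the neurons predicted to be active need to be computed.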

Huge thanks for your fast reply. What a great team! @Raincleared, can you please provide a little more detail on the conversion of HF models into PowerInfer GGUF? For example, how would I convert a totally fresh LLaMA-like model (e.g., Mistral) into a GGUF model?
As far as I understand, I could train the activation predictor using the repository you provided earlier, but how can I make a ReLU version of a model (like the Mistral one you took from SparseLLM)? It would be great if you could provide a step-by-step answer if you have time. Thanks in advance. I really want to contribute to the PowerInfer community.

PowerInfer org
  1. The first step, converting Swish-activated models (e.g., Mistral-7B) into ReLU-activated models, is too complicated to summarize here, so we strongly recommend reading our paper.
  2. Prepare a dataset in JSONL format that represents the target domain where you want to run your model. This corresponds to the file /home/jeeves/sparse_test_data.jsonl in the following scripts.
  3. The third step is to obtain data for training the activation predictor for a specific ReLU-activated model. You may follow get_llama_data.py. Remember to specify the model_path and data_path.
  4. The fourth step is to train the activation predictors. You can follow the script run_c4_mlp.py. Remember to specify the model_name, model, and data_path.
  5. After you obtain the predictors, please follow the Convert from Original Model Weights + Predictor Weights section of the README in PowerInfer to obtain the final GGUF files.
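
Step 2 above can be sketched as follows. JSONL simply means one JSON object per line. The `text` field name, the sample contents, and the output location are assumptions made for illustration; check get_llama_data.py for the schema it actually expects before using a file like this.

```python
import json
import tempfile
from pathlib import Path

def write_jsonl(records, path):
    """Write one JSON object per line (the JSONL convention)."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

def read_jsonl(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

# Hypothetical samples; the "text" key is an assumption, not a confirmed schema.
samples = [
    {"text": "First example document from the target domain."},
    {"text": "Second example document from the target domain."},
]

# Written to the system temp directory here; point this at your own data path.
out_path = Path(tempfile.gettempdir()) / "sparse_test_data.jsonl"
write_jsonl(samples, out_path)

# Round-trip check: the file parses back to the same records.
loaded = read_jsonl(out_path)
print(f"wrote {len(loaded)} records to {out_path}")
```

The resulting file path is what you would pass as data_path in steps 3 and 4.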

Oh my goodness, thanks for the fast response. I will try to convert Mistral into PowerInfer format by following your advice. Again, your research is really cool and opens a new door for local LLMs. Love the PowerInfer community.

Raincleared changed discussion status to closed
