OPEA

cicdatopea committed commit 454e3b7 (verified, parent: 7382bd7): Update README.md

Files changed (1): README.md (+22, -0)

README.md CHANGED
@@ -85,8 +85,30 @@ Please follow the [Build llama.cpp locally](https://github.com/ggerganov/llama.c

 **5×80 GB of GPU memory is needed (this could be optimized), plus 1.4 TB of CPU memory.**

+ **1. Add metadata to the bf16 model** (https://huggingface.co/opensourcerelease/DeepSeek-V3-bf16)
+
+ ```python
+ import safetensors
+ from safetensors.torch import save_file
+
+ # Rewrite each of the 163 shards in place, adding the {'format': 'pt'}
+ # metadata that transformers checks for when loading safetensors files.
+ for i in range(1, 164):
+     safetensors_path = f"model-{i:05d}-of-000163.safetensors"
+     print(safetensors_path)  # progress log
+     tensors = {}
+     with safetensors.safe_open(safetensors_path, framework="pt") as f:
+         for key in f.keys():
+             tensors[key] = f.get_tensor(key)
+     save_file(tensors, safetensors_path, metadata={"format": "pt"})
+ ```
+
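As an optional sanity check, you can re-open a shard with the same `safe_open` API and confirm the metadata was written; this is a minimal sketch, and the shard name below is simply the first one produced by the loop above:

```python
import safetensors

# Inspect one rewritten shard; metadata() should now report the 'pt' format.
with safetensors.safe_open("model-00001-of-000163.safetensors", framework="pt") as f:
    print(f.metadata())  # expected: {'format': 'pt'}
```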
+ **2. Replace modeling_deepseek.py with the following file.** It mainly aligns devices and removes `torch.no_grad`, since AutoRound needs gradients for its tuning:
+
+ https://github.com/intel/auto-round/blob/deepseekv3/modeling_deepseek.py
+
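For intuition on the `torch.no_grad` removal, here is a minimal, self-contained sketch (illustrative stand-ins only, not the actual patch): a `no_grad` region in the forward pass detaches the computation graph, so a tunable quantization parameter would never receive a gradient.

```python
import torch

# Stand-ins (illustrative only): a frozen weight and a tunable rounding offset.
w = torch.randn(4, 4)
v = torch.zeros(4, 4, requires_grad=True)
x = torch.randn(2, 4)

# Stock-style forward under no_grad: the graph is detached,
# so the offset can never receive a gradient.
with torch.no_grad():
    y = x @ (w + v).t()
print(y.requires_grad)  # False

# Patched-style forward without no_grad: gradients flow to the offset,
# which is what AutoRound-style tuning relies on.
y = x @ (w + v).t()
y.sum().backward()
print(v.grad is not None)  # True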
 pip3 install git+https://github.com/intel/auto-round.git

+ **3. Tuning**
 ```python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer