Update README.md

This is an experimental model.

The idea is:

- Calculate the difference in weights between a donor model (meta-math/MetaMath-Mistral-7B) and the base model (mistralai/Mistral-7B-v0.1). This difference represents how much each parameter needs to be adjusted to go from the base state to the donor state.

```
vector = math_model.state_dict()[k] - base_model.state_dict()[k]
```
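Applied over the whole state dict, step one could look roughly like the sketch below. The loading code, `torch_dtype=torch.float16`, and the `task_vector` name are illustrative assumptions rather than the exact script used here, and the two checkpoints are assumed to share identical parameter names and shapes.

```
# Sketch of step one: build a "math" task vector for every parameter.
# Assumes both checkpoints fit in memory and have identical keys/shapes.
import torch
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", torch_dtype=torch.float16
)
math_model = AutoModelForCausalLM.from_pretrained(
    "meta-math/MetaMath-Mistral-7B", torch_dtype=torch.float16
)

base_sd = base_model.state_dict()
math_sd = math_model.state_dict()

# One delta tensor per parameter: how far the donor moved away from the base.
task_vector = {k: math_sd[k] - base_sd[k] for k in base_sd}
```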
- The vector retrieved in step one is added to a third model (lex-hue/Delexa-7b). This should transfer **math** *skills* to our third model.

```
vector = math_model.state_dict()[k] - base_model.state_dict()[k]
new_v = v + vector.to(v.device)
v.copy_(new_v)
```
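Putting the two steps together, a minimal end-to-end sketch might look like the following. The loading code, dtype, and the `Delexa-7b-math` output directory are placeholders, not the exact script or repository name used for this model, and all three checkpoints are assumed to share identical parameter names and shapes.

```
# Sketch of the full merge: apply the donor-minus-base delta onto Delexa-7b.
# Assumes identical keys/shapes across all three checkpoints.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", torch_dtype=torch.float16
)
math_model = AutoModelForCausalLM.from_pretrained(
    "meta-math/MetaMath-Mistral-7B", torch_dtype=torch.float16
)
model = AutoModelForCausalLM.from_pretrained(
    "lex-hue/Delexa-7b", torch_dtype=torch.float16
)

base_sd = base_model.state_dict()
math_sd = math_model.state_dict()

with torch.no_grad():
    for k, v in model.state_dict().items():
        # Step one: the math task vector for this parameter.
        vector = math_sd[k] - base_sd[k]
        # Step two: shift Delexa's weights by that vector, in place.
        v.copy_(v + vector.to(v.device))

# Placeholder output directory for the merged checkpoint.
model.save_pretrained("Delexa-7b-math")
AutoTokenizer.from_pretrained("lex-hue/Delexa-7b").save_pretrained("Delexa-7b-math")
```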
### Example:
```
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch