TehVenom committed on
Commit 3cf067a
1 Parent(s): eb010f1

Update README.md

Files changed (1):
  1. README.md +3 -2
README.md CHANGED
@@ -6,7 +6,7 @@ inference: false
 ---
 # GPT-J 6B - PPO_Pygway Mix
 ## Model description
-This is a merged model, using an averaged weights strategy at a (20:20:60) ratio between the models:
+This is a merged model, using a weighted parameter blend strategy at a (20:20:60) ratio between the models:
 
 - [20%] - KoboldAI/GPT-J-6B-Janeway: https://huggingface.co/KoboldAI/GPT-J-6B-Janeway
 - [20%] - reciprocate/ppo_hh_gpt-j: https://huggingface.co/reciprocate/ppo_hh_gpt-j
@@ -36,7 +36,8 @@ PPO_Pygway combines `ppo_hh_gpt-j`, `Janeway-6b` and `Pygmalion-6b`; all three m
 (X*A + Y*B)
 ```
 With X & Y being the model weights, and A/B being how strongly they are represented within the final value.
-The intent of this is to elevate the end-model by borrowing the strongly represented aspects of each base model.
+The intent of this is to elevate the end-model by borrowing the strongly represented aspects of each base model,
+but it may in part weaken other parts of each model, which can be desirable if the base models have problematic traits that need to be worked on.
 
 Blend was done in FP32 and output saved in FP16 for reduced storage needs.
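
The `(X*A + Y*B)` averaging described in the diff can be sketched as below. This is an illustrative outline only, not the author's actual merge script: real GPT-J checkpoints hold `torch` tensors, but plain floats are used here so the arithmetic stays easy to follow, and the parameter name `wte.weight` is just an example.

```python
def blend_state_dicts(state_dicts, weights):
    """Weighted average of matching parameters across model state dicts.

    Sketch under the assumption that every state dict shares the same
    parameter names; each merged value is sum(w_i * param_i).
    """
    assert abs(sum(weights) - 1.0) < 1e-9, "blend ratios should sum to 1"
    merged = {}
    for name in state_dicts[0]:
        # One weighted sum per parameter, mirroring (X*A + Y*B) per tensor
        merged[name] = sum(w * sd[name] for w, sd in zip(weights, state_dicts))
    return merged

# (20:20:60) ratio between Janeway, ppo_hh, and Pygmalion, as in the README
janeway   = {"wte.weight": 1.0}
ppo_hh    = {"wte.weight": 2.0}
pygmalion = {"wte.weight": 3.0}
merged = blend_state_dicts([janeway, ppo_hh, pygmalion], [0.2, 0.2, 0.6])
# merged["wte.weight"] ≈ 0.2*1.0 + 0.2*2.0 + 0.6*3.0 = 2.4
```

In a real merge, the same loop would run over FP32 tensors (matching the README's note that the blend was done in FP32) before casting the result to FP16 for storage.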