davidxmle committed on
Commit
5b3086f
1 Parent(s): 316c745

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -39,7 +39,7 @@ tags:
  - Created by [David Xue](https://www.linkedin.com/in/david-xue-uva/) from [Astronomer](https://astronomer.io)
 
  ## Description
- - This is the exact same model ([meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)) with the weights for the input and output embeddings from lm head and embedding matrix adjusted for certain tokens that were untrained which caused widespread issues for people attempting to fine-tune this base model with either adding their own tokens or using existing special tokens.
+ This is the exact same model ([meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)) with the weights for the input and output embeddings from lm head and embedding matrix adjusted for certain tokens that were untrained which caused widespread issues for people attempting to fine-tune this base model with either adding their own tokens or using existing special tokens.
 
  ## Why We Made This Model
 
@@ -351,4 +351,6 @@ Once these untrained tokens are identified, the average of trained tokens can be
 
  Lastly, the problematic token's rows in the 2 embedding matrics are set to the computed mean, thus completing the adjustment.
 
+ ## Contributors
+ - [David Xue](https://www.linkedin.com/in/david-xue-uva/), Machine Learning Engineer from [Astronomer](https://astronomer.io)
 
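
The adjustment described in the README context lines above (average the trained tokens, then set the problematic tokens' rows in both embedding matrices to that mean) can be sketched roughly as follows. This is a minimal illustration, not the author's actual script. It assumes NumPy arrays stand in for the model weights, that the untrained token IDs are already known (how they are identified is not shown in this diff), and that "average of trained tokens" means the per-dimension mean over all remaining rows. The names `reset_untrained_rows`, `embed_tokens`, `lm_head`, and `untrained_ids` are hypothetical.

```python
import numpy as np


def reset_untrained_rows(weights, untrained_ids):
    """Return a copy of `weights` whose untrained rows are set to the
    mean of all other (assumed trained) rows."""
    weights = np.asarray(weights, dtype=np.float64)

    # Boolean mask marking the rows that are assumed to be untrained.
    untrained = np.zeros(weights.shape[0], dtype=bool)
    untrained[list(untrained_ids)] = True

    # Per-dimension mean over the rows treated as trained.
    mean_row = weights[~untrained].mean(axis=0)

    # Overwrite only the untrained rows; trained rows are left unchanged.
    patched = weights.copy()
    patched[untrained] = mean_row
    return patched


# Toy stand-ins for the two matrices: 4 tokens, 2 dimensions each.
# Token 3 is treated as untrained (assumption for illustration only).
embed_tokens = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [0.0, 0.0]])
lm_head = np.array([[2.0, 0.0], [4.0, 2.0], [6.0, 4.0], [0.0, 0.0]])
untrained_ids = [3]

# The same adjustment is applied to both matrices independently.
embed_tokens_fixed = reset_untrained_rows(embed_tokens, untrained_ids)
lm_head_fixed = reset_untrained_rows(lm_head, untrained_ids)
```

In this toy case, row 3 of `embed_tokens_fixed` becomes the mean of rows 0 to 2, and likewise for `lm_head_fixed`. With a real checkpoint the same idea would apply to the model's input embedding and output projection weights, but the exact loading and saving steps depend on the framework and are not covered here.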