astronomer/Llama-3-8B-Special-Tokens-Adjusted


davidxmle committed on Apr 22

Commit 5b3086f • 1 Parent(s): 316c745

Update README.md

Files changed (1)

README.md +3 -1

README.md CHANGED

@@ -39,7 +39,7 @@ tags:
 - Created by [David Xue](https://www.linkedin.com/in/david-xue-uva/) from [Astronomer](https://astronomer.io)
 ## Description
-- This is the exact same model ([meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)) with the weights for the input and output embeddings from lm head and embedding matrix adjusted for certain tokens that were untrained which caused widespread issues for people attempting to fine-tune this base model with either adding their own tokens or using existing special tokens.
+This is the exact same model ([meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)) with the weights for the input and output embeddings from lm head and embedding matrix adjusted for certain tokens that were untrained which caused widespread issues for people attempting to fine-tune this base model with either adding their own tokens or using existing special tokens.
 ## Why We Made This Model
@@ -351,4 +351,6 @@ Once these untrained tokens are identified, the average of trained tokens can be
 Lastly, the problematic token's rows in the 2 embedding matrics are set to the computed mean, thus completing the adjustment.
+## Contributors
+- [David Xue](https://www.linkedin.com/in/david-xue-uva/), Machine Learning Engineer from [Astronomer](https://astronomer.io)
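
The second hunk's context line describes the final adjustment step: each untrained token's row in the two embedding matrices is overwritten with the mean of the trained rows. The following is a minimal, dependency-free sketch of that step on toy data, not the author's actual implementation. The function names, the toy matrices, and the `untrained_ids` list are all hypothetical. How untrained tokens are identified is not shown in this diff, so the IDs are taken as given. The sketch also assumes the mean is computed separately for each matrix.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def mean_of_trained_rows(matrix: Matrix, untrained_ids: Sequence[int]) -> List[float]:
    """Column-wise mean over every row whose index is NOT in untrained_ids."""
    skip = set(untrained_ids)
    trained = [row for i, row in enumerate(matrix) if i not in skip]
    if not trained:
        raise ValueError("no trained rows to average")
    return [sum(col) / len(trained) for col in zip(*trained)]


def adjust_untrained_rows(matrix: Matrix, untrained_ids: Sequence[int]) -> Matrix:
    """Return a copy of matrix with each untrained row replaced by the trained-row mean."""
    mean_row = mean_of_trained_rows(matrix, untrained_ids)
    adjusted = [list(row) for row in matrix]  # copy so the input is left untouched
    for i in untrained_ids:
        adjusted[i] = list(mean_row)
    return adjusted


# Toy stand-ins for the input embedding matrix and the lm head (4 tokens, dim 2).
# Token 2 is assumed to be untrained (hypothetical; identification is out of scope here).
embed_tokens = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [5.0, 6.0]]
lm_head = [[2.0, 0.0], [4.0, 2.0], [0.0, 0.0], [6.0, 4.0]]
untrained_ids = [2]

new_embed = adjust_untrained_rows(embed_tokens, untrained_ids)
new_head = adjust_untrained_rows(lm_head, untrained_ids)
```

On a real checkpoint the same row replacement would be done on the model's weight tensors rather than nested lists; the list version only illustrates the arithmetic.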