shng2025/gptesla-small-vocabulary__archive-small-tokenizer-only

shng2025 committed 2 days ago

Commit d147b9d • 1 Parent(s): 92d2a77

Update README.md

Files changed (1)

README.md +8 -6

@@ -17,19 +17,19 @@ tags: []
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
+- **Developed by:** shng2025
 - **Funded by [optional]:** [More Information Needed]
 - **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
+- **Model type:** LLM, ~150M
+- **Language(s) (NLP):** English, Python
+- **License:** mit
+- **Finetuned from model [optional]:** gpt2 (I think); based on CodeParrot
 ### Model Sources [optional]
 <!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
+- **Repository:** https://github.com/Ice-Citron/GPTesla
 - **Paper [optional]:** [More Information Needed]
 - **Demo [optional]:** [More Information Needed]
@@ -37,6 +37,8 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+This is my first, preliminary model, trained by following O'Reilly's transformer book. The book is guiding me through the workflow of training a CodeParrot-based transformer before I start research for my CS EE, likely on ways to optimise training transformers from scratch.
 ### Direct Use
 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->