SubhrajitSain/anwgpt2-355m


SubhrajitSain committed on Sep 7

Commit 4055d3b · verified · 1 Parent(s): 3ac7be1

Update README.md

Files changed (1)

README.md +25 -16

@@ -2,32 +2,41 @@
 library_name: transformers
 base_model: SubhrajitSain/anwgpt2-345m
 tags:
-- generated_from_trainer
 model-index:
 - name: anwgpt2-345m
   results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# anwgpt2-345m
-This model is a fine-tuned version of [SubhrajitSain/anwgpt2-345m](https://huggingface.co/SubhrajitSain/anwgpt2-345m) on an unknown dataset.
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
 ### Training hyperparameters
@@ -45,4 +54,4 @@ The following hyperparameters were used during training:
 - Transformers 4.56.0
 - Pytorch 2.8.0+cu126
 - Datasets 2.14.6
-- Tokenizers 0.22.0

 library_name: transformers
 base_model: SubhrajitSain/anwgpt2-345m
 tags:
+- gpt2
+- gpt2-medium
 model-index:
 - name: anwgpt2-345m
   results: []
+license: mit
+datasets:
+- Elriggs/openwebtext-100k
+language:
+- en
+pipeline_tag: text-generation
 ---
+# anwgpt2-355m
+My second attempt at an LLM.
+## Model Details
+*   **Model Type:** GPT-2
+*   **Model Size:** 354,823,168 parameters
+*   **Base Model:** `gpt2-medium`
+*   **Dataset:** Elriggs/openwebtext-100k
+*   **Training Framework:** Hugging Face Transformers
+### Intended Use
+This model is intended for text generation tasks.
+### Training
+The model was fine-tuned on the `Elriggs/openwebtext-100k` dataset.
+### Evaluation
+Evaluation was not recorded.
+### Limitations
+The model may produce repetitive text, but this is very unlikely.
 ### Training hyperparameters
 - Transformers 4.56.0
 - Pytorch 2.8.0+cu126
 - Datasets 2.14.6
+- Tokenizers 0.22.0
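
Review note: the parameter count added under Model Details (354,823,168) can be cross-checked by hand. The sketch below assumes the fine-tuned model keeps the standard `gpt2-medium` configuration (24 layers, hidden size 1024, vocabulary 50,257, 1,024 positions) and ties the output head to the token embeddings. The commit does not state either assumption, and `gpt2_param_count` is a hypothetical helper written for this check, not part of any library.

```python
def gpt2_param_count(n_layer, n_embd, vocab_size, n_positions):
    """Count parameters of a GPT-2 style decoder with tied input/output embeddings."""
    # Token embeddings plus learned position embeddings.
    embeddings = vocab_size * n_embd + n_positions * n_embd
    # Each LayerNorm has a weight and a bias vector.
    layer_norm = 2 * n_embd
    # Attention: fused QKV projection (with bias) plus output projection (with bias).
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # MLP: expand to 4x hidden size, then project back, both with biases.
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    # Each block holds two LayerNorms, one attention module, and one MLP.
    per_block = 2 * layer_norm + attention + mlp
    # The final LayerNorm follows the last block; the LM head is assumed tied,
    # so it adds no parameters of its own.
    return embeddings + n_layer * per_block + layer_norm


# Assumed gpt2-medium configuration values (not stated in the commit).
total = gpt2_param_count(n_layer=24, n_embd=1024, vocab_size=50257, n_positions=1024)
print(f"{total:,}")  # prints 354,823,168
```

Under these assumptions the result matches the figure in the new model card. That is consistent with the "355m" in the repository name, while the `anwgpt2-345m` entries in the front matter use the older rounded label for the same model size.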