van-ng/gpt2-XYZCompany-500-steps

Text Generation

Generated from Trainer

van-ng committed on Mar 12

Commit c6a277c • 1 Parent(s): 94cf122

Update README.md

Files changed (1)

README.md +4 -17

README.md CHANGED

@@ -1,6 +1,6 @@
 ---
 license: mit
-base_model: gpt2
 tags:
 - generated_from_trainer
 model-index:
@@ -13,9 +13,8 @@ should probably proofread and complete it, then remove this comment. -->
 # gpt2-XYZCompany-500-steps
-This model is a question-answer chatbot for XYZCompany. It can answer questions related to the company. It is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on XYZCompany's dataset containing question-answer pairs.
-It achieves the following results on the evaluation set:
-- Loss: 0.3300
 ## Model description
@@ -41,22 +40,10 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 1
-- training_steps: 500
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 0.471         | 0.32  | 50   | 0.4135          |
-| 0.4572        | 0.63  | 100  | 0.3736          |
-| 0.3903        | 0.95  | 150  | 0.3574          |
-| 0.3748        | 1.27  | 200  | 0.3474          |
-| 0.3639        | 1.58  | 250  | 0.3413          |
-| 0.3515        | 1.9   | 300  | 0.3366          |
-| 0.3539        | 2.22  | 350  | 0.3337          |
-| 0.3604        | 2.53  | 400  | 0.3319          |
-| 0.3579        | 2.85  | 450  | 0.3305          |
-| 0.3176        | 3.16  | 500  | 0.3300          |
 ### Framework versions
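
For reviewers reading the hyperparameter lines above: `lr_scheduler_type: linear` with `lr_scheduler_warmup_steps: 1` and `training_steps: 500` means the learning rate ramps up for a single step and then decays linearly to zero. The sketch below is a plain-Python approximation of that shape, not the Trainer's own code. The function name `linear_lr_multiplier` is hypothetical, and because the base learning rate is not visible in this diff, the function returns only the multiplier applied to it.

```python
def linear_lr_multiplier(step, warmup_steps=1, total_steps=500):
    """Multiplier applied to the base learning rate at a given optimizer step.

    Assumption: linear warmup from 0 to 1 over `warmup_steps`, then linear
    decay from 1 to 0 at `total_steps`. Defaults follow the values in the diff.
    """
    if step < warmup_steps:
        # Warmup phase: ramp up from 0.
        return step / max(1, warmup_steps)
    # Decay phase: fall linearly to 0 at the final step.
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    for s in (0, 1, 250, 500):
        print(s, linear_lr_multiplier(s))
```

With a one-step warmup the schedule is effectively pure linear decay: the multiplier reaches its peak at step 1 and is about half of it midway through training.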

 ---
 license: mit
+base_model: EleutherAI/pythia-160m
 tags:
 - generated_from_trainer
 model-index:
 # gpt2-XYZCompany-500-steps
+This model is a question-answer chatbot for XYZCompany. It can answer questions related to the company. It is a fine-tuned version of [pythia-160m](https://huggingface.co/EleutherAI/pythia-160m) on XYZCompany's dataset containing question-answer pairs.
 ## Model description
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 1
+- training_steps: 1000 (~6 epochs)
 ### Training results
 ### Framework versions
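
The new `training_steps: 1000 (~6 epochs)` line can be sanity-checked against the removed results table, which logged epoch 3.16 at step 500. The sketch below does that arithmetic. It assumes the new run uses the same dataset and effective batch size as the old one, which this diff does not confirm; the helper names are hypothetical.

```python
# (epoch, step) pairs copied from the removed "Training results" table.
logged = [(0.32, 50), (0.63, 100), (0.95, 150), (1.27, 200), (1.58, 250),
          (1.9, 300), (2.22, 350), (2.53, 400), (2.85, 450), (3.16, 500)]


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    epoch, step = rows[-1]
    return step / epoch


def epochs_for(total_steps, rows):
    """Convert a step budget to epochs, assuming unchanged data and batch size."""
    return total_steps / steps_per_epoch(rows)


if __name__ == "__main__":
    print(round(steps_per_epoch(logged)))        # 158
    print(round(epochs_for(1000, logged), 2))    # 6.32
```

Under those assumptions, 1000 steps works out to roughly 6.3 epochs, which is consistent with the "~6 epochs" annotation.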