jlpan/starcoder-cpp2py-newsnippet1

Generated from Trainer


jlpan committed on Aug 23, 2023

Commit 56c380c · 1 Parent(s): 3f2d6f0

update model card README.md

Files changed (1)

README.md +11 -16


@@ -6,7 +6,6 @@ tags:
 model-index:
 - name: starcoder-cpp2py-newsnippet1
   results: []
-library_name: peft
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -16,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [bigcode/starcoder](https://huggingface.co/bigcode/starcoder) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.1966
+- Loss: 0.1961
 ## Model description
@@ -36,11 +35,11 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 9e-05
-- train_batch_size: 4
-- eval_batch_size: 4
+- train_batch_size: 32
+- eval_batch_size: 32
 - seed: 42
-- gradient_accumulation_steps: 32
-- total_train_batch_size: 128
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 256
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 15
@@ -50,20 +49,16 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 4.3776        | 0.17  | 25   | 0.4660          |
-| 0.2863        | 0.33  | 50   | 0.2139          |
-| 0.214         | 0.5   | 75   | 0.2023          |
-| 0.2076        | 0.67  | 100  | 0.1981          |
-| 0.2079        | 0.83  | 125  | 0.1968          |
-| 0.2008        | 1.0   | 150  | 0.1966          |
+| 4.3812        | 0.17  | 25   | 0.4652          |
+| 0.2923        | 0.33  | 50   | 0.2125          |
+| 0.2148        | 0.5   | 75   | 0.2013          |
+| 0.2051        | 0.67  | 100  | 0.1971          |
+| 0.2003        | 0.83  | 125  | 0.1964          |
+| 0.1882        | 1.05  | 150  | 0.1961          |
 ### Framework versions
-- PEFT 0.5.0.dev0
-- PEFT 0.5.0.dev0
-- PEFT 0.5.0.dev0
-- PEFT 0.5.0.dev0
 - Transformers 4.32.0.dev0
 - Pytorch 2.0.1+cu117
 - Datasets 2.12.0
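
Note on the batch-size change: the `total_train_batch_size` values on both sides of the diff match the usual product of per-device batch size, gradient accumulation steps, and device count. The sketch below checks that arithmetic. It assumes a single training device, which the model card does not state, and `effective_batch_size` is a hypothetical helper name, not part of any library.

```python
def effective_batch_size(per_device_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of samples that contribute to one optimizer step.

    num_devices=1 is an assumption; the model card does not report
    how many devices were used for training.
    """
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values taken from the removed (-) and added (+) lines of the diff.
before = effective_batch_size(per_device_batch_size=4, gradient_accumulation_steps=32)
after = effective_batch_size(per_device_batch_size=32, gradient_accumulation_steps=8)

print(f"before: {before}, after: {after}")
```

Under that single-device assumption, the results agree with the reported totals of 128 before the commit and 256 after it.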