Anish13 committed on
Commit 81bd1d6
1 Parent(s): 0f813b5

End of training
README.md ADDED
@@ -0,0 +1,67 @@
+ ---
+ license: mit
+ base_model: gpt2
+ tags:
+ - generated_from_trainer
+ model-index:
+ - name: pretrained_gpt2
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # pretrained_gpt2
+
+ This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on the None dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 6.8305
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0003
+ - train_batch_size: 32
+ - eval_batch_size: 32
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 10
+ - mixed_precision_training: Native AMP
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:-----:|:----:|:---------------:|
+ | 6.4795        | 0.94  | 500  | 6.1043          |
+ | 5.3968        | 1.87  | 1000 | 5.7712          |
+ | 4.7369        | 2.81  | 1500 | 5.6812          |
+ | 4.1696        | 3.75  | 2000 | 5.7365          |
+ | 3.6165        | 4.68  | 2500 | 5.8735          |
+ | 3.098         | 5.62  | 3000 | 6.0607          |
+ | 2.595         | 6.55  | 3500 | 6.3035          |
+ | 2.1458        | 7.49  | 4000 | 6.5112          |
+ | 1.7782        | 8.43  | 4500 | 6.7049          |
+ | 1.5026        | 9.36  | 5000 | 6.8305          |
+
+
+ ### Framework versions
+
+ - Transformers 4.35.2
+ - Pytorch 2.1.1
+ - Datasets 2.15.0
+ - Tokenizers 0.15.0
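The card above lists `lr_scheduler_type: linear`, which decays the learning rate from `learning_rate: 0.0003` down to 0 over training. As a rough sketch in plain Python: the total step count (~5340) is inferred from the results table (step 5000 at epoch 9.36, scaled to 10 epochs), and the zero warmup is an assumption since the card lists none.

```python
def linear_lr(step, total_steps=5340, base_lr=3e-4, warmup_steps=0):
    """Linear schedule matching lr_scheduler_type: linear.

    total_steps (~5340) is inferred from the results table;
    warmup_steps=0 is assumed, as the card lists no warmup.
    """
    if step < warmup_steps:
        # Linear ramp-up during warmup (unused here with warmup_steps=0).
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * remaining

print(linear_lr(0))     # starts at the configured 0.0003
print(linear_lr(5340))  # reaches 0.0 at the final step
```

This mirrors what the Trainer's default linear scheduler computes per optimizer step; the exact step count in the actual run depends on the dataset size and batch size.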
generation_config.json ADDED
@@ -0,0 +1,6 @@
+ {
+   "_from_model_config": true,
+   "bos_token_id": 50257,
+   "eos_token_id": 50258,
+   "transformers_version": "4.35.2"
+ }
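Note the special-token ids in this config: stock GPT-2 uses id 50256 for both BOS and EOS, so ids 50257 and 50258 sit beyond the base vocabulary and imply the tokenizer was extended with custom special tokens. A small standard-library sketch, with the file contents copied from the diff above:

```python
import json

# Contents of generation_config.json as committed above.
raw = """{
  "_from_model_config": true,
  "bos_token_id": 50257,
  "eos_token_id": 50258,
  "transformers_version": "4.35.2"
}"""

config = json.loads(raw)

# GPT-2's base vocabulary spans ids 0..50256; these ids lie beyond it,
# so the checkpoint expects a tokenizer with added special tokens.
assert config["bos_token_id"] > 50256 and config["eos_token_id"] > 50256
print(config["bos_token_id"], config["eos_token_id"])  # 50257 50258
```

Loading the checkpoint with a plain `gpt2` tokenizer would therefore mismatch these ids; the tokenizer committed alongside the model should be used.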
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:1ff0a0197cc271be747388bbf5aa38d64795863a137e59aef8c2e14bb8919cac
+ oid sha256:ae37a6f78b482dfda3308a107447050b1cb5e31062ca0e328880670906f32502
  size 497783424
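`model.safetensors` is tracked with Git LFS, so the repository stores only a small pointer file (a version line, an `oid sha256:…` digest, and a `size` in bytes) while the actual weights live in LFS storage. A minimal standard-library sketch parsing such a pointer, using the new oid from the diff above:

```python
# Git LFS pointer file, as shown in the diff above (new oid).
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:ae37a6f78b482dfda3308a107447050b1cb5e31062ca0e328880670906f32502
size 497783424
"""

# Each line is "key value"; split once on the first space.
fields = dict(line.split(" ", 1) for line in pointer.strip().splitlines())
algo, digest = fields["oid"].split(":", 1)

assert fields["version"] == "https://git-lfs.github.com/spec/v1"
assert algo == "sha256" and len(digest) == 64  # hex-encoded SHA-256
print(f"{int(fields['size']) / 1e6:.1f} MB")  # ~497.8 MB of weights
```

Because only the oid changed while the size stayed at 497783424 bytes, this commit replaced the weight values without changing the model architecture or parameter count.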
runs/Feb07_21-26-23_nlp-gpu-01.be.ucsc.edu/events.out.tfevents.1707369984.nlp-gpu-01.be.ucsc.edu.4198.2 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4658c82d9a08e4f7f50b3c97a95b11942cbcb568e403dfc831e9bc488f40b7a0
- size 8736
+ oid sha256:ae57fbd1fdd0f0b84243fe933a71522ef7c5507fbd29a5c820d7894f04571ac8
+ size 9090