ke-lly/47163343_0

+---
+license: mit
+base_model: openai-community/gpt2
+tags:
+- trl
+- sft
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: '47163343_0'
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# 47163343_0
+This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.6417
+- Accuracy: 0.0001
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 1.41e-05
+- train_batch_size: 32
+- eval_batch_size: 4
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 2
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 256
+- total_eval_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- training_steps: 2000
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|
+| 1.0667        | 0.48  | 25   | 0.9624          | 0.0001   |
+| 0.854         | 0.97  | 50   | 0.8076          | 0.0001   |
+| 0.7965        | 1.45  | 75   | 0.7675          | 0.0002   |
+| 0.7817        | 1.93  | 100  | 0.7461          | 0.0001   |
+| 0.7668        | 2.42  | 125  | 0.7326          | 0.0001   |
+| 0.7533        | 2.9   | 150  | 0.7225          | 0.0001   |
+| 0.7479        | 3.38  | 175  | 0.7150          | 0.0001   |
+| 0.7325        | 3.86  | 200  | 0.7104          | 0.0001   |
+| 0.7397        | 4.35  | 225  | 0.7055          | 0.0001   |
+| 0.7255        | 4.83  | 250  | 0.7020          | 0.0001   |
+| 0.7182        | 5.31  | 275  | 0.6983          | 0.0001   |
+| 0.7161        | 5.8   | 300  | 0.6955          | 0.0001   |
+| 0.7162        | 6.28  | 325  | 0.6925          | 0.0001   |
+| 0.7043        | 6.76  | 350  | 0.6901          | 0.0001   |
+| 0.7076        | 7.25  | 375  | 0.6868          | 0.0001   |
+| 0.7042        | 7.73  | 400  | 0.6844          | 0.0001   |
+| 0.6975        | 8.21  | 425  | 0.6820          | 0.0001   |
+| 0.7021        | 8.7   | 450  | 0.6799          | 0.0001   |
+| 0.6955        | 9.18  | 475  | 0.6778          | 0.0001   |
+| 0.6868        | 9.66  | 500  | 0.6763          | 0.0001   |
+| 0.6866        | 10.14 | 525  | 0.6744          | 0.0001   |
+| 0.6903        | 10.63 | 550  | 0.6723          | 0.0001   |
+| 0.6786        | 11.11 | 575  | 0.6709          | 0.0001   |
+| 0.6843        | 11.59 | 600  | 0.6695          | 0.0001   |
+| 0.6835        | 12.08 | 625  | 0.6680          | 0.0001   |
+| 0.6819        | 12.56 | 650  | 0.6669          | 0.0001   |
+| 0.6804        | 13.04 | 675  | 0.6656          | 0.0001   |
+| 0.6748        | 13.53 | 700  | 0.6642          | 0.0001   |
+| 0.674         | 14.01 | 725  | 0.6637          | 0.0001   |
+| 0.6731        | 14.49 | 750  | 0.6624          | 0.0001   |
+| 0.681         | 14.98 | 775  | 0.6611          | 0.0001   |
+| 0.6763        | 15.46 | 800  | 0.6602          | 0.0001   |
+| 0.677         | 15.94 | 825  | 0.6597          | 0.0001   |
+| 0.6725        | 16.43 | 850  | 0.6583          | 0.0001   |
+| 0.6669        | 16.91 | 875  | 0.6574          | 0.0001   |
+| 0.6682        | 17.39 | 900  | 0.6567          | 0.0001   |
+| 0.669         | 17.87 | 925  | 0.6559          | 0.0001   |
+| 0.6647        | 18.36 | 950  | 0.6554          | 0.0001   |
+| 0.664         | 18.84 | 975  | 0.6549          | 0.0001   |
+| 0.6563        | 19.32 | 1000 | 0.6542          | 0.0001   |
+| 0.6656        | 19.81 | 1025 | 0.6533          | 0.0001   |
+| 0.6634        | 20.29 | 1050 | 0.6530          | 0.0001   |
+| 0.6592        | 20.77 | 1075 | 0.6521          | 0.0001   |
+| 0.6558        | 21.26 | 1100 | 0.6514          | 0.0001   |
+| 0.6664        | 21.74 | 1125 | 0.6511          | 0.0001   |
+| 0.6561        | 22.22 | 1150 | 0.6504          | 0.0001   |
+| 0.6634        | 22.71 | 1175 | 0.6499          | 0.0001   |
+| 0.6679        | 23.19 | 1200 | 0.6491          | 0.0001   |
+| 0.6625        | 23.67 | 1225 | 0.6489          | 0.0001   |
+| 0.6619        | 24.15 | 1250 | 0.6483          | 0.0001   |
+| 0.6495        | 24.64 | 1275 | 0.6479          | 0.0001   |
+| 0.6547        | 25.12 | 1300 | 0.6474          | 0.0001   |
+| 0.6649        | 25.6  | 1325 | 0.6469          | 0.0001   |
+| 0.6551        | 26.09 | 1350 | 0.6466          | 0.0001   |
+| 0.6547        | 26.57 | 1375 | 0.6463          | 0.0001   |
+| 0.6546        | 27.05 | 1400 | 0.6458          | 0.0001   |
+| 0.6576        | 27.54 | 1425 | 0.6456          | 0.0001   |
+| 0.6576        | 28.02 | 1450 | 0.6452          | 0.0001   |
+| 0.6568        | 28.5  | 1475 | 0.6448          | 0.0001   |
+| 0.6596        | 28.99 | 1500 | 0.6446          | 0.0001   |
+| 0.6538        | 29.47 | 1525 | 0.6443          | 0.0001   |
+| 0.6488        | 29.95 | 1550 | 0.6440          | 0.0001   |
+| 0.6433        | 30.43 | 1575 | 0.6437          | 0.0001   |
+| 0.6583        | 30.92 | 1600 | 0.6435          | 0.0001   |
+| 0.6575        | 31.4  | 1625 | 0.6432          | 0.0001   |
+| 0.6465        | 31.88 | 1650 | 0.6430          | 0.0001   |
+| 0.6495        | 32.37 | 1675 | 0.6429          | 0.0001   |
+| 0.6487        | 32.85 | 1700 | 0.6427          | 0.0001   |
+| 0.6571        | 33.33 | 1725 | 0.6426          | 0.0001   |
+| 0.6463        | 33.82 | 1750 | 0.6425          | 0.0001   |
+| 0.648         | 34.3  | 1775 | 0.6423          | 0.0001   |
+| 0.6537        | 34.78 | 1800 | 0.6422          | 0.0001   |
+| 0.6564        | 35.27 | 1825 | 0.6420          | 0.0001   |
+| 0.6491        | 35.75 | 1850 | 0.6420          | 0.0001   |
+| 0.6549        | 36.23 | 1875 | 0.6419          | 0.0001   |
+| 0.6524        | 36.71 | 1900 | 0.6418          | 0.0001   |
+| 0.6522        | 37.2  | 1925 | 0.6418          | 0.0001   |
+| 0.655         | 37.68 | 1950 | 0.6417          | 0.0001   |
+| 0.6614        | 38.16 | 1975 | 0.6417          | 0.0001   |
+| 0.6451        | 38.65 | 2000 | 0.6417          | 0.0001   |
+### Framework versions
+- Transformers 4.37.0
+- Pytorch 2.0.0+cu118
+- Datasets 2.16.1
+- Tokenizers 0.15.1
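
The aggregate batch sizes in the hyperparameter list above follow from the per-device values, the device count, and the gradient accumulation steps. The sketch below reproduces that arithmetic with the numbers copied from the card. The steps-per-epoch and training-set-size figures are rough estimates inferred from the final row of the results table (step 2000 at epoch 38.65); they are assumptions for illustration, not values stated in the card.

```python
# Values copied from the model card's hyperparameter list.
train_batch_size = 32            # per-device training batch size
eval_batch_size = 4              # per-device evaluation batch size
num_devices = 2
gradient_accumulation_steps = 4
training_steps = 2000

# Final row of the training results table: step 2000 was logged at epoch 38.65.
final_epoch = 38.65

# Samples consumed per optimizer step across all devices.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps

# Evaluation does not accumulate gradients, so only the device count multiplies.
total_eval_batch_size = eval_batch_size * num_devices

# Rough estimates (assumption): derived from the logged epoch, not reported in the card.
steps_per_epoch = training_steps / final_epoch
approx_train_examples = steps_per_epoch * total_train_batch_size

print("total_train_batch_size:", total_train_batch_size)
print("total_eval_batch_size:", total_eval_batch_size)
print("estimated optimizer steps per epoch:", round(steps_per_epoch, 1))
print("estimated training examples:", round(approx_train_examples))
```

The first two results match the `total_train_batch_size: 256` and `total_eval_batch_size: 8` entries in the card. The dataset-size estimate is only approximate, since logged epoch values are rounded and the last batch of an epoch may be partial.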

generation_config.json ADDED

@@ -0,0 +1,6 @@

+{
+  "_from_model_config": true,
+  "bos_token_id": 50256,
+  "eos_token_id": 50256,
+  "transformers_version": "4.37.0"
+}
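
In the config above, `bos_token_id` and `eos_token_id` share the value 50256, which is GPT-2's `<|endoftext|>` token. As a minimal sketch, the snippet below parses the same JSON with Python's standard library and checks that the two ids coincide. The JSON is inlined as a string for illustration; reading it from a local `generation_config.json` file would work the same way.

```python
import json

# Contents of generation_config.json as shown in the diff above,
# inlined here so the example is self-contained.
config_text = """
{
  "_from_model_config": true,
  "bos_token_id": 50256,
  "eos_token_id": 50256,
  "transformers_version": "4.37.0"
}
"""

config = json.loads(config_text)

# GPT-2 uses one special token for both beginning and end of sequence,
# so the two ids are expected to match.
same_token = config["bos_token_id"] == config["eos_token_id"]

print("bos/eos share one token id:", same_token)
print("derived from model config:", config["_from_model_config"])
```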