Karzan/gpt2-walamakan

Karzan committed on Aug 26, 2023

Commit d80e833 • 1 Parent(s): beb29bb

End of training

Files changed (4)

README.md +58 -0
config.json +1 -1
pytorch_model.bin +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -1,3 +1,61 @@
 ---
 license: apache-2.0
+base_model: Karzan/gpt2-walamakan
+tags:
+- generated_from_trainer
+model-index:
+- name: gpt2-walamakan
+  results: []
 ---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# gpt2-walamakan
+This model is a fine-tuned version of [Karzan/gpt2-walamakan](https://huggingface.co/Karzan/gpt2-walamakan) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 7.1414
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 3e-05
+- train_batch_size: 32
+- eval_batch_size: 32
+- seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 128
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 3
+### Training results
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 0.3787        | 0.98  | 23   | 7.1415          |
+| 0.3827        | 2.0   | 47   | 7.1610          |
+| 0.3831        | 2.94  | 69   | 7.1414          |
+### Framework versions
+- Transformers 4.32.0
+- Pytorch 2.0.1+cu118
+- Datasets 2.14.4
+- Tokenizers 0.13.3
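
The hyperparameters and results in the README.md diff above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes a single training device, assumes the reported loss is a mean cross-entropy in nats, and assumes the linear schedule ran with no warmup over the 69 logged steps. None of these are stated in the card, and `linear_lr` is a hypothetical helper, not a Transformers function.

```python
import math

# Values copied from the model card added in this commit.
train_batch_size = 32
gradient_accumulation_steps = 4
learning_rate = 3e-05
final_step = 69            # last step logged in the training results table
final_eval_loss = 7.1414   # validation loss at that step

# Assumption: one device, so the effective batch size is the per-device
# batch size times the number of gradient accumulation steps.
num_devices = 1
total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_devices
)

# Assumption: the loss is mean cross-entropy in nats per token, in which
# case perplexity is simply exp(loss).
perplexity = math.exp(final_eval_loss)


def linear_lr(step, total_steps=final_step, base_lr=learning_rate, warmup_steps=0):
    """Hypothetical linear schedule: optional warmup, then decay to zero.

    warmup_steps=0 and total_steps=69 are assumptions, not values from the card.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * remaining


print("effective batch size:", total_train_batch_size)
print("validation perplexity (approx.):", round(perplexity))
print("lr at first and last step:", linear_lr(0), linear_lr(final_step))
```

Under the single-device assumption the product matches the `total_train_batch_size: 128` line in the card.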

config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "/content/drive/MyDrive/models/gpt2-text-generator-ckb",
+  "_name_or_path": "Karzan/gpt2-walamakan",
   "activation_function": "gelu_new",
   "architectures": [
     "GPT2LMHeadModel"

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f493ce30e1467596ea50fdd8fad6d03cf2b1cebb108c45b5f7af042b5ab5a3b5
+oid sha256:2875784e77a873ce834aa3a5147396e4c0a9a1bc9b974e6de3d1d449d03f2aa1
 size 854378685

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:df813b8d42c1b14a91ab846fc88983204128816e2732ac51f1334e7294a0e5fe
+oid sha256:8ac618d5b21b64d9490276c9883ad97ea5088db4065491fe1f1131edac056c6f
 size 4027
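
The two binary files above are stored as Git LFS pointers, so the diff only shows a changed `oid` with an unchanged `size`. As a rough sketch of how a downloaded file could be checked against such a pointer, the following uses only the Python standard library. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the pointer text is copied from the new side of the `training_args.bin` diff.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file's byte size and SHA-256 match the pointer."""
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so large weight files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["oid"]


# Pointer contents taken from the new side of the training_args.bin diff.
new_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:8ac618d5b21b64d9490276c9883ad97ea5088db4065491fe1f1131edac056c6f
size 4027
""")
print(new_pointer["algo"], new_pointer["size"])
```

Calling `matches_pointer("training_args.bin", new_pointer)` on a local copy of the file would then report whether it corresponds to this commit; the local path is an assumption about where the file was downloaded.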