vldsavelyev/murakami_rugpt3small



vldsavelyev committed on Mar 25, 2023

Commit ba810ec • 1 Parent(s): e520818

Update model card

Files changed (1)

README.md +8 -9

README.md CHANGED

@@ -1,4 +1,6 @@
 ---
+language:
+- ru
 tags:
 - generated_from_trainer
 datasets:
@@ -8,24 +10,20 @@ model-index:
   results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 # murakami_rugpt3small
-This model was trained from scratch on the murakami dataset.
 ## Model description
-More information needed
+Fine-tuned from [sberbank-ai/rugpt3small_based_on_gpt2](https://huggingface.co/sberbank-ai/rugpt3small_based_on_gpt2)
 ## Intended uses & limitations
-More information needed
+Generate articles
 ## Training and evaluation data
-More information needed
+Fine-tuned on [murakami](https://huggingface.co/datasets/vldsavelyev/murakami) dataset,
+which was built from Russian translations of novels by Haruki Murakami.
 ## Training procedure
@@ -38,8 +36,9 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- gradient_checkpointing: True
 - num_epochs: 3.0
-- mixed_precision_training: Native AMP
+- mixed_precision_training: Native AMP (fp16=True)
 ### Training results
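
For reference, the hyperparameters visible in the updated card could map onto `transformers.TrainingArguments` roughly as sketched below. This is not the author's training script. The keyword names are the usual `TrainingArguments` parameters, the `output_dir` value is a hypothetical placeholder, and settings that fall outside the diff hunks (learning rate, batch sizes) are deliberately left out rather than guessed.

```python
# Sketch only: hyperparameters listed in the model card after this commit,
# expressed as keyword arguments for transformers.TrainingArguments.
TRAINING_KWARGS = {
    "output_dir": "murakami_rugpt3small",  # assumption: placeholder path, not from the card
    "seed": 42,
    "adam_beta1": 0.9,               # optimizer: Adam with betas=(0.9, 0.999)
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,           # ... and epsilon=1e-08
    "lr_scheduler_type": "linear",
    "gradient_checkpointing": True,  # added to the card in this commit
    "num_train_epochs": 3.0,
    "fp16": True,                    # "Native AMP (fp16=True)" in the card
}


def build_training_arguments():
    """Instantiate TrainingArguments from the values above.

    The import is deferred so the dictionary can be inspected without
    transformers installed. fp16=True generally needs a GPU (or other
    supported accelerator) to be accepted.
    """
    from transformers import TrainingArguments

    return TrainingArguments(**TRAINING_KWARGS)
```

Calling `build_training_arguments()` on a suitable machine would give an object that can be passed to `Trainer`; the learning rate and batch size would still need to be taken from the full README.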