
argpt2-goodreads

This model is a fine-tuned version of gpt2-medium on the goodreads LABR dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4389

Model description

Generates Arabic sentences, either positive or negative review examples, based on the goodreads corpus.

Intended uses & limitations

The model is fine-tuned on Arabic only, with the aim of generating review-like sentences; to do the same for other languages you need to fine-tune it yourself. Any harmful content generated by GPT-2 should not be used anywhere.

Training and evaluation data

Training and validation were done on the goodreads LABR dataset, with 80% used for training and 20% for testing.
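
For reference, a minimal sketch of such an 80/20 split with the datasets library. The file name labr_reviews.csv and its columns are placeholders, not taken from this card; adapt them to your local copy of LABR.

from datasets import load_dataset

# Hypothetical local copy of the LABR goodreads reviews
dataset = load_dataset("csv", data_files="labr_reviews.csv")["train"]

# 80% train / 20% test, with a fixed seed for reproducibility
splits = dataset.train_test_split(test_size=0.2, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]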

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("mofawzy/argpt2-goodreads")
model = AutoModelForCausalLM.from_pretrained("mofawzy/argpt2-goodreads")
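
A minimal generation example; the Arabic prompt and the sampling settings below are illustrative choices, not part of the original card.

# Encode a short Arabic prompt and sample a continuation
inputs = tokenizer("هذا الكتاب", return_tensors="pt")
outputs = model.generate(**inputs, max_length=80, do_sample=True, top_k=50, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))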

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • distributed_type: tpu
  • num_devices: 8
  • total_train_batch_size: 128
  • total_eval_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20.0
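
These settings correspond roughly to the transformers.TrainingArguments sketched below. This is for orientation only: the original training script is not included in the card, output_dir is a placeholder, and the total batch size of 128 comes from launching on 8 TPU devices rather than from any argument here.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="argpt2-goodreads",      # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    num_train_epochs=20.0,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)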

Training results

  • train_loss = 1.474

Evaluation results

  • eval_loss = 1.4389

train metrics

  • epoch = 20.0
  • train_loss = 1.474
  • train_runtime = 2:18:14.51
  • train_samples = 108110
  • train_samples_per_second = 260.678
  • train_steps_per_second = 2.037

eval metrics

  • epoch = 20.0
  • eval_loss = 1.4389
  • eval_runtime = 0:04:37.01
  • eval_samples = 27329
  • eval_samples_per_second = 98.655
  • eval_steps_per_second = 0.773
  • perplexity = 4.2162
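
The reported perplexity is simply the exponential of the evaluation loss:

import math
math.exp(1.4389)  # ≈ 4.216, matching the reported perplexity of 4.2162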

Framework versions

  • Transformers 4.13.0.dev0
  • Pytorch 1.10.0+cu102
  • Datasets 1.16.1
  • Tokenizers 0.10.3