Introduction: This repository contains a fine-tuned DistilGPT2 model for generating essays on topics spanning Arts, Science, and Culture.

Dataset: The training dataset comprises 2000+ essays on topics in Arts, Science, and Culture. The essays were written by human experts and represent a range of opinions and knowledge, so the model learns from high-quality, varied content.

Model Training:

  • epoch: 50
  • training_loss: 2.473200
  • validation_loss: 4.569556
  • perplexities: [517.4149169921875, 924.535888671875, 704.73291015625, 465.9677429199219, 577.629150390625, 443.994140625, 770.1861572265625, 683.028076171875, 1017.7510375976562, 880.795166015625]
  • mean_perplexity: 698.603519

Description: The model achieved a mean perplexity of 698.603519 on the validation set, which is the average of the ten perplexity values listed above. Perplexity measures how well the model predicts held-out text, and lower values are better.
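The reported mean can be checked directly from the listed values. The short sketch below recomputes it and, for comparison only, also evaluates the exponential of the reported validation loss, since perplexity is conventionally defined as the exponential of the mean per-token cross-entropy. The card does not say how the ten perplexity values were obtained, so the second figure is not assumed to match them; the variable names are illustrative.

```python
import math

# Perplexity values copied from the "Model Training" list.
perplexities = [
    517.4149169921875, 924.535888671875, 704.73291015625,
    465.9677429199219, 577.629150390625, 443.994140625,
    770.1861572265625, 683.028076171875, 1017.7510375976562,
    880.795166015625,
]

# Arithmetic mean of the listed values.
mean_perplexity = sum(perplexities) / len(perplexities)

# Conventional definition: perplexity = exp(mean cross-entropy loss).
# Shown for comparison; the card does not state that the listed
# perplexities were derived from this loss value.
validation_loss = 4.569556
perplexity_from_loss = math.exp(validation_loss)

print(f"mean of listed perplexities: {mean_perplexity:.6f}")
print(f"exp(validation_loss):        {perplexity_from_loss:.2f}")
```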

During text generation, the following parameters are used:

  • max_length: The maximum length of the generated sequence, set to 400 tokens. In the Hugging Face transformers library this count includes the prompt tokens.
  • num_beams: The number of beams for beam search, set to 10. A higher value explores more candidate sequences in parallel, which tends to produce higher-likelihood output but increases inference time.
  • early_stopping: If set to True, beam search stops as soon as num_beams complete candidate sequences have been found.
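  • temperature: The sampling temperature, set to 0.3. Lower values make the output more deterministic; the setting only has an effect when sampling is enabled.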
  • no_repeat_ngram_size: Set to 2, so that no sequence of two consecutive tokens (bigram) appears more than once in the generated text.
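The parameters above map onto keyword arguments of `generate` in the Hugging Face transformers library. The following is a minimal sketch rather than the author's actual code. It makes several assumptions: `"distilgpt2"` is the base checkpoint and stands in for this repository's model ID, which the card does not give; `early_stopping` is taken to be True; `do_sample=True` is added so that `temperature` takes effect; and the helper name `generate_essay` is hypothetical.

```python
# Generation settings as listed in the card.
GENERATION_KWARGS = {
    "max_length": 400,          # total length in tokens, prompt included
    "num_beams": 10,            # beam search width
    "early_stopping": True,     # assumption: the card only says "if set to True"
    "temperature": 0.3,         # only used when sampling is enabled
    "no_repeat_ngram_size": 2,  # forbid repeated bigrams
}


def generate_essay(prompt, model_id="distilgpt2", do_sample=True):
    """Generate text for `prompt` with the card's settings.

    `model_id` defaults to the base DistilGPT2 checkpoint as a placeholder;
    replace it with the fine-tuned model's repository ID.
    `do_sample=True` is an assumption, not something the card states.
    """
    # Imported lazily so the settings above can be inspected without
    # transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        do_sample=do_sample,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
        **GENERATION_KWARGS,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_essay("The role of art in society")` would download the checkpoint on first use and return the prompt followed by the generated continuation.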

The Kaggle notebook for this project is linked here: Kaggle Notebook

Model size: 81.9M params, stored as Safetensors with tensor type F32.