gpt-2-finetuned-wikitext2
This model is a fine-tuned version of openai-community/gpt2 on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 3.3924
Model Description
This language model is built on OpenAI's GPT-2 architecture. Text is tokenized with OpenAI's tiktoken. For more details on tiktoken, refer to its official GitHub repository.
Tokenizer Overview
To explore how the tiktoken tokenizer behaves, you can use the tiktoken interactive website, which visualizes how the tokenizer segments input text into tokens.
Model Checkpoint
The base checkpoint is the GPT-2 model published under the openai-community organization on the Hugging Face Model Hub: openai-community/gpt2.
Training Details
The model was trained for 3 epochs, that is, three full passes over the training dataset (7,002 optimizer steps in total, per the training results table).
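As a rough check of how epochs relate to optimizer steps, the sketch below uses the figures reported in this card (2,334 steps per epoch, train batch size 8, 3 epochs). It assumes a single device and no gradient accumulation, neither of which the card states; the helper name `steps_per_epoch` is illustrative only.

```python
import math


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Optimizer steps for one pass over the data.

    Assumes one device and no gradient accumulation (not stated in the card).
    """
    return math.ceil(num_examples / batch_size)


STEPS_PER_EPOCH = 2334  # from the training results table
BATCH_SIZE = 8          # train_batch_size from the hyperparameters
NUM_EPOCHS = 3

total_steps = STEPS_PER_EPOCH * NUM_EPOCHS

# Range of training-set sizes consistent with 2334 steps at batch size 8.
min_examples = (STEPS_PER_EPOCH - 1) * BATCH_SIZE + 1
max_examples = STEPS_PER_EPOCH * BATCH_SIZE

print(f"total steps: {total_steps}")
print(f"training examples per epoch: between {min_examples} and {max_examples}")
```

Under these assumptions, the step counts imply a training set of roughly 18.7 thousand examples (after whatever preprocessing the training script applied).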
Training and evaluation data
Evaluation Data
The training script used an evaluation dataset to measure the model's performance.
Evaluation Results
After training, the model was assessed on the evaluation dataset. Perplexity, a common metric for language modeling computed as the exponential of the evaluation loss, was 29.74:
import math

# `trainer` is the Trainer instance used for fine-tuning
eval_results = trainer.evaluate()
print(f"Perplexity: {math.exp(eval_results['eval_loss']):.2f}")
>>> Perplexity: 29.74
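To make the link between loss and perplexity concrete, here is a minimal standalone sketch that needs no trained model. It assumes the evaluation loss is the mean per-token negative log-likelihood in nats, which is the usual convention for causal language modeling; the function names are illustrative and not part of any library.

```python
import math


def perplexity_from_loss(mean_nll: float) -> float:
    """Perplexity is the exponential of the mean per-token negative log-likelihood."""
    return math.exp(mean_nll)


def perplexity_from_token_probs(probs: list[float]) -> float:
    """Same quantity, computed from the probability the model gave each true token."""
    nll = [-math.log(p) for p in probs]
    return math.exp(sum(nll) / len(nll))


# Final evaluation loss reported in this card.
print(f"Perplexity: {perplexity_from_loss(3.3924):.2f}")

# Toy case: a model that assigns probability 1/4 to every true token
# is as uncertain as a uniform choice among 4 options.
uniform_ppl = perplexity_from_token_probs([0.25, 0.25, 0.25, 0.25])
print(f"Toy perplexity: {uniform_ppl:.2f}")
```

Read this way, a perplexity of 29.74 means the model is, on average, about as uncertain as if it were choosing uniformly among roughly 30 tokens at each position.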
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
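To illustrate what a linear learning-rate schedule means for these settings, the sketch below computes the learning rate at a given step in plain Python. It assumes zero warmup steps (the card lists none) and 7,002 total steps (taken from the training results table); `linear_lr` is a hypothetical helper written for illustration, not the scheduler the training script actually called.

```python
def linear_lr(
    step: int,
    base_lr: float = 2e-5,    # learning_rate from the list above
    total_steps: int = 7002,  # final step in the training results table
    warmup_steps: int = 0,    # assumption: the card does not mention warmup
) -> float:
    """Linear warmup (if any), then linear decay from base_lr to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start of training and at the end of each epoch.
for step in (0, 2334, 4668, 7002):
    print(f"step {step:>4}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the learning rate falls by one third of its initial value per epoch and reaches zero at the final step.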
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
3.4934 | 1.0 | 2334 | 3.4145 |
3.3567 | 2.0 | 4668 | 3.3953 |
3.2968 | 3.0 | 7002 | 3.3924 |
Framework versions
- Transformers 4.37.2
- Pytorch 2.1.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2
Model tree for brooksideas/gpt-2-finetuned-wikitext2
Base model
openai-community/gpt2