---
tags:
- gpt-neo
- gpt-peter
- chatbot
inference: false
base_model: EleutherAI/gpt-neo-2.7B
---

# pszemraj/gpt-peter-2.7B

- This model is a fine-tuned version of [EleutherAI/gpt-neo-2.7B](https://huggingface.co/EleutherAI/gpt-neo-2.7B) on about 80k WhatsApp and iMessage texts.
- The model is too large to use the inference API. A notebook for testing in Colab is linked [here](https://colab.research.google.com/gist/pszemraj/a59b43813437b43973c8f8f9a3944565/testing-pszemraj-gpt-peter-2-7b.ipynb).
- Alternatively, you can message [a bot on Telegram](http://t.me/GPTPeter_bot) where I test LLMs for dialogue generation.
- The Telegram bot code and the model training code can be found [in this repository](https://github.com/pszemraj/ai-msgbot).

## Usage in Python

Install the transformers library if you don't have it:

```
pip install -U transformers
```

Load the model into a `pipeline` object:

```
from transformers import pipeline
import torch

# Use the first GPU if one is available, otherwise fall back to CPU
my_chatbot = pipeline(
    'text-generation',
    'pszemraj/gpt-peter-2.7B',
    device=0 if torch.cuda.is_available() else -1,
)
```

Generate text!

```
my_chatbot('Did you ever hear the tragedy of Darth Plagueis The Wise?')
```

_(The example above is kept simple; adding generation parameters such as `no_repeat_ngram_size` is recommended for better generations.)_
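
As a rough sketch of what passing generation parameters could look like, the helper below wraps a `text-generation` pipeline call. The name `generate_reply`, the `GENERATION_KWARGS` dict, and every value in it are illustrative assumptions, not settings recommended by the model author; only `no_repeat_ngram_size` is mentioned in this card.

```python
# Hypothetical generation settings; the values are illustrative assumptions,
# not the settings used by the model author.
GENERATION_KWARGS = {
    "max_new_tokens": 64,        # cap the length of the reply
    "do_sample": True,           # sample instead of greedy decoding
    "top_p": 0.95,               # nucleus sampling threshold
    "no_repeat_ngram_size": 3,   # block any 3-gram from repeating
    "return_full_text": False,   # return only the continuation, not the prompt
}


def generate_reply(chatbot, prompt, **overrides):
    """Call a text-generation pipeline and return the first generated string."""
    # Per-call overrides take precedence over the defaults above
    kwargs = {**GENERATION_KWARGS, **overrides}
    outputs = chatbot(prompt, **kwargs)
    return outputs[0]["generated_text"].strip()
```

With the `my_chatbot` pipeline loaded as shown earlier, this could be called as `generate_reply(my_chatbot, 'Did you ever hear the tragedy of Darth Plagueis The Wise?')`. Individual settings can be overridden per call, for example `generate_reply(my_chatbot, prompt, no_repeat_ngram_size=4)`.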

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 6e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 32
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1
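
To illustrate how `learning_rate`, `lr_scheduler_type: cosine`, and `lr_scheduler_warmup_ratio` interact, here is a minimal pure-Python sketch. It assumes the common shape of linear warmup from zero followed by cosine decay to zero; the function name `lr_at_step` and the `total_steps=1000` value are hypothetical, since the card does not report the number of optimizer steps.

```python
import math

BASE_LR = 6e-05       # learning_rate from the list above
WARMUP_RATIO = 0.05   # lr_scheduler_warmup_ratio from the list above


def lr_at_step(step, total_steps, base_lr=BASE_LR, warmup_ratio=WARMUP_RATIO):
    """Learning rate at an optimizer step: linear warmup, then cosine decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup steps
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, from 0.0 to 1.0
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# total_steps=1000 is a hypothetical value; the card does not report the step count.
for step in (0, 25, 50, 525, 1000):
    print(step, lr_at_step(step, total_steps=1000))
```

Under these assumptions the rate climbs to its peak of 6e-05 after the first 5% of steps, passes through roughly half of that at the midpoint of the decay phase, and reaches zero at the final step.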

### Framework versions

- Transformers 4.17.0
- Pytorch 1.10.0+cu113
- Datasets 2.0.0
- Tokenizers 0.11.6