---
license: mit
tags:
- generated_from_trainer
model-index:
- name: GPT2-Medium-Alpaca-355m
  results: []
datasets:
- tatsu-lab/alpaca
widget:
- text: |-
    You are a chat bot that provides professional answers to questions asked

    What is the purpose of life
language:
- en
pipeline_tag: text-generation
---
|
|
|
|
|
|
# GPT2-Medium-Alpaca-355m

## Model description

This model is derived from [gpt2-medium](https://huggingface.co/gpt2-medium) (355M parameters) and was fine-tuned by the Anezatra team on the [tatsu-lab/alpaca](https://huggingface.co/datasets/tatsu-lab/alpaca) dataset. It is suited to text generation and language understanding tasks, making it a good fit for chat applications.
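
For quick testing, the model can be loaded with the 🤗 Transformers `pipeline` API. A minimal sketch is shown below; the repository id `anezatra/GPT2-Medium-Alpaca-355m` is an assumption based on the model name above, and the prompt format follows the widget example in this card.

```python
from transformers import pipeline

# Assumed repository id; substitute the actual Hub id if it differs.
generator = pipeline("text-generation", model="anezatra/GPT2-Medium-Alpaca-355m")

# Prompt format taken from the widget example: an instruction line,
# a blank line, then the user question.
prompt = (
    "You are a chat bot that provides professional answers to questions asked\n"
    "\n"
    "What is the purpose of life"
)

outputs = generator(prompt, max_new_tokens=100, do_sample=True, top_p=0.9)
print(outputs[0]["generated_text"])
```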
|
|
|
## Training procedure

The model was trained on 4 x NVIDIA A100 GPUs.
|
|
|
### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 128
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.15
- num_epochs: 1
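
For reference, the sketch below shows how these settings map onto 🤗 Transformers `TrainingArguments`. The output directory is a placeholder rather than a value from the original run, and the Adam settings listed above are the Trainer defaults, so no explicit optimizer arguments are needed.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt2-medium-alpaca",  # placeholder, not from the original run
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=128,
    lr_scheduler_type="linear",
    warmup_ratio=0.15,
    num_train_epochs=1,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer default,
    # so it requires no explicit arguments here.
)
```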