---
tags:
  - generated_from_trainer
model-index:
  - name: mpt-mini-shakespeare
    results: []
---

# mpt-mini-shakespeare

This model was trained from scratch on https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt.

## Model description

The configuration and code are adapted from mosaicml/mpt-7b-storywriter, with the configuration parameters changed to make it a very tiny model.
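
A minimal sketch of how such a shrunken configuration can be derived; the dimension values below are illustrative assumptions, not necessarily the ones used for this checkpoint:

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Start from the storywriter configuration (custom MPT code, hence
# trust_remote_code) and shrink it. The sizes below are placeholders,
# not this checkpoint's actual values.
config = AutoConfig.from_pretrained(
    "mosaicml/mpt-7b-storywriter",
    trust_remote_code=True,
    d_model=64,       # hidden size (assumed tiny value)
    n_heads=4,        # attention heads (assumed)
    n_layers=2,       # transformer blocks (assumed)
    max_seq_len=256,  # context length (assumed)
)
model = AutoModelForCausalLM.from_config(config, trust_remote_code=True)
```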

## Intended uses & limitations

This model is intended only to aid debugging of a GGML port of mpt-7b-storywriter.
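
For that debugging use case, the model can be loaded to produce reference logits for comparison against the GGML port's output. The repo id and prompt below are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "jploski/mpt-mini-shakespeare"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)

inputs = tokenizer("ROMEO:", return_tensors="pt")
with torch.no_grad():
    # Compare these logits against the GGML port for the same prompt.
    logits = model(**inputs).logits
print(logits.shape)
```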

## Training and evaluation data

The single tinyshakespeare text file linked above serves as both the training and the validation set; see the training procedure below.

## Training procedure

The text file is split into paragraphs, and the resulting examples are used as both the training and the validation set.
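
A minimal sketch of that preprocessing, assuming paragraphs are delimited by blank lines (the actual splitting logic is not documented here):

```python
import requests
from datasets import Dataset

URL = ("https://raw.githubusercontent.com/karpathy/char-rnn/"
       "master/data/tinyshakespeare/input.txt")
text = requests.get(URL).text

# Split on blank lines into paragraphs (assumed delimiter); the same
# dataset is reused for both training and validation.
paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
dataset = Dataset.from_dict({"text": paragraphs})
train_ds = eval_ds = dataset
```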

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch reproducing them follows the list):

- learning_rate: 0.0005
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 1
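
A sketch mapping these values onto `transformers.TrainingArguments`; the `output_dir` is a placeholder, and the listed Adam betas and epsilon are the Transformers defaults, so they are not set explicitly:

```python
from transformers import TrainingArguments

# Direct mapping of the listed hyperparameters.
# total_train_batch_size (256) is derived: 32 per device * 8 accumulation steps.
args = TrainingArguments(
    output_dir="mpt-mini-shakespeare",  # placeholder
    learning_rate=5e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=8,
    lr_scheduler_type="cosine",
    warmup_steps=10,
    num_train_epochs=1,
)
```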

### Training results

Mediocre, as expected for such a tiny model.

### Framework versions

- Transformers 4.28.0
- Pytorch 2.0.1+cu117
- Datasets 2.12.0
- Tokenizers 0.13.3