metadata
license: apache-2.0
base_model: google/flan-t5-large
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: flan-t5-large-spelling-peft
    results: []

flan-t5-large-spelling-peft

This model is an experimental PEFT adapter for google/flan-t5-large, trained on the wiki.en dataset from oliverguhr/spelling.

It achieves the following results on the evaluation set:

  • Loss: 0.2537
  • Rouge1: 95.8905
  • Rouge2: 91.9178
  • Rougel: 95.8459
  • Rougelsum: 95.8393
  • Gen Len: 33.61
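To give a feel for what the ROUGE-1 numbers measure, here is a minimal sketch of a unigram-overlap F1 score on the same 0 to 100 scale. It is a simplified illustration only: the function name `rouge1_f1` is hypothetical, it tokenizes on whitespace, and it is not the metric implementation that produced the figures above.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1 (0-100) using lowercase whitespace tokens."""
    pred = Counter(prediction.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both.
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


perfect = rouge1_f1("This restaurant is awesome", "This restaurant is awesome")
one_typo = rouge1_f1("This restuarant is awesome", "This restaurant is awesome")
```

Under this simplified measure, a four-word sentence with one uncorrected typo still shares three of its four words with the reference, so an unchanged input can already score fairly high.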

Model description

This is an experimental model that should be capable of fixing typos and punctuation.

# Requires the `peft` package in addition to `transformers`.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

model_id = "google/flan-t5-large"
peft_model_id = "jbochi/flan-t5-large-spelling-peft"

# Load the base model, then attach the adapter weights on top of it.
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
model.load_adapter(peft_model_id)

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Inputs should start with the "Fix spelling: " prefix used during training.
pipe = pipeline("text2text-generation", model=model, tokenizer=tokenizer)
pipe("Fix spelling: This restuarant is awesome")
# [{'generated_text': 'This restaurant is awesome'}]

Intended uses & limitations

Intended for research purposes.

  • It may produce artifacts.
  • It doesn't seem capable of fixing multiple errors in a single sentence.
  • It doesn't support languages other than English.
  • It was fine-tuned with a max_length of 100 tokens.
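Given the 100-token limit and the difficulty with several errors per sentence, one possible workaround is to feed the model one short piece of text at a time. The sketch below is an assumption-laden illustration, not part of the original model card: `split_for_model` and its `max_words` default of 50 are hypothetical, and whitespace words are only a rough stand-in for tokenizer tokens (a word often maps to more than one token).

```python
import re


def split_for_model(text: str, max_words: int = 50) -> list[str]:
    """Split text into sentences, then break any sentence longer than max_words.

    max_words is a hypothetical, conservative word budget chosen to stay
    well below the 100-token fine-tuning length.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pieces = []
    for sentence in sentences:
        words = sentence.split()
        for start in range(0, len(words), max_words):
            pieces.append(" ".join(words[start:start + max_words]))
    return pieces


pieces = split_for_model("I liek cats. Dogs are grate too!")
```

Each piece can then be prefixed and passed to the pipeline separately. For an exact budget, count tokens with the model's tokenizer instead of counting words.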

Training and evaluation data

Data from oliverguhr/spelling, with a "Fix spelling: " prefix added to every example.
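Because every training example carried this prefix, inference inputs should carry it too. A minimal sketch of a helper that adds it, where `build_prompt` is a hypothetical name introduced here for illustration:

```python
PREFIX = "Fix spelling: "  # prefix stated in this model card


def build_prompt(text: str) -> str:
    """Prepend the task prefix unless the text already starts with it."""
    text = text.strip()
    return text if text.startswith(PREFIX) else PREFIX + text


prompts = [
    build_prompt(t)
    for t in ["This restuarant is awesome", "Fix spelling: I liek cats"]
]
```

The resulting `prompts` list can be passed to the pipeline as-is.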

During training, the model was evaluated only on the first 100 test examples.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
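To illustrate what a linear scheduler does with a learning rate of 0.001, here is a small sketch of the decay rule. The warmup length and total step count are not listed above, so `warmup_steps=0` and `total_steps=10_000` are assumptions made purely for this example, and `linear_lr` is a hypothetical helper name.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 1e-3,
              warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumed total of 10_000 optimizer steps, for illustration only.
schedule = [linear_lr(s, total_steps=10_000) for s in (0, 5_000, 10_000)]
```

With these assumed values the rate starts at 0.001, is halved at the midpoint, and reaches zero at the final step.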

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 0.3359 | 0.05 | 500 | 0.2738 | 95.8385 | 91.6723 | 95.7821 | 95.766 | 33.5 |
| 0.2853 | 0.11 | 1000 | 0.2702 | 95.7124 | 91.5043 | 95.656 | 95.651 | 33.53 |
| 0.2691 | 0.16 | 1500 | 0.2691 | 95.735 | 91.7108 | 95.7039 | 95.7067 | 33.41 |
| 0.2596 | 0.21 | 2000 | 0.2663 | 95.9819 | 92.0897 | 95.9519 | 95.9488 | 33.51 |
| 0.2536 | 0.27 | 2500 | 0.2621 | 95.7519 | 91.5445 | 95.6614 | 95.6622 | 33.49 |
| 0.2472 | 0.32 | 3000 | 0.2626 | 95.7052 | 91.7321 | 95.6476 | 95.6512 | 33.58 |
| 0.2448 | 0.37 | 3500 | 0.2669 | 95.8003 | 91.7949 | 95.7536 | 95.7576 | 33.57 |
| 0.2345 | 0.43 | 4000 | 0.2582 | 95.8784 | 92.008 | 95.8284 | 95.8343 | 33.65 |
| 0.2345 | 0.48 | 4500 | 0.2629 | 95.8131 | 91.9088 | 95.7624 | 95.766 | 33.63 |
| 0.2284 | 0.53 | 5000 | 0.2585 | 95.8552 | 91.9833 | 95.8105 | 95.8135 | 33.62 |
| 0.2266 | 0.59 | 5500 | 0.2591 | 95.9205 | 92.0577 | 95.8689 | 95.8718 | 33.61 |
| 0.2281 | 0.64 | 6000 | 0.2605 | 95.9172 | 91.9782 | 95.874 | 95.8638 | 33.59 |
| 0.2228 | 0.69 | 6500 | 0.2566 | 95.7612 | 91.7858 | 95.7129 | 95.7058 | 33.63 |
| 0.2202 | 0.75 | 7000 | 0.2561 | 95.9468 | 92.0914 | 95.9018 | 95.8941 | 33.64 |
| 0.218 | 0.8 | 7500 | 0.2579 | 95.9468 | 92.0914 | 95.9018 | 95.8941 | 33.64 |
| 0.2162 | 0.85 | 8000 | 0.2523 | 95.8231 | 91.9464 | 95.7727 | 95.7758 | 33.66 |
| 0.2135 | 0.91 | 8500 | 0.2549 | 95.8388 | 91.9804 | 95.7914 | 95.7917 | 33.63 |
| 0.2124 | 0.96 | 9000 | 0.2537 | 95.8905 | 91.9178 | 95.8459 | 95.8393 | 33.61 |

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.16.0
  • Tokenizers 0.15.0