---
license: apache-2.0
tags:
  - generated_from_trainer
model-index:
  - name: t5-v1_1-base-ft-jflAUG
widget:
  - text: Anna and Mike is going skiing
    example_title: skiing
  - text: >-
      so em if we have an now so with fito ringina know how to estimate the tren
      given the ereafte mylite trend we can also em an estimate is nod s i again
      tort watfettering an we have estimated the trend an called wot to be
      called sthat of exty right now we can and look at wy this should not hare
      a trend i becan we just remove the trend an and we can we now estimate
      tesees ona effect of them exty
    example_title: Transcribed Audio Example 2
  - text: I would like a peice of pie.
    example_title: miss-spelling
  - text: >-
      My coworker said he used a financial planner to help choose his stocks so
      he wouldn't loose money.
    example_title: incorrect word choice (context)
  - text: >-
      good so hve on an tadley i'm not able to make it to the exla session on
      monday this week e which is why i am e recording pre recording an this
      excelleision and so to day i want e to talk about two things and first of
      all em i wont em wene give a summary er about ta ohow to remove trents in
      these nalitives from time series
    example_title: lowercased audio transcription output
  - text: Frustrated, the chairs took me forever to set up.
    example_title: dangling modifier
  - text: There car broke down so their hitching a ride to they're class.
    example_title: compound-1
inference:
  parameters:
    no_repeat_ngram_size: 2
    max_length: 64
    min_length: 4
    num_beams: 4
    repetition_penalty: 3.51
    length_penalty: 0.8
    early_stopping: true
---

# t5-v1_1-base-ft-jflAUG

This model is a fine-tuned version of [google/t5-v1_1-base](https://huggingface.co/google/t5-v1_1-base) on an expanded version of the JFLEG dataset.

## Model description

More information needed

## Intended uses & limitations

More information needed
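
Pending a fuller writeup, the snippet below shows one way to query the model for grammar correction via the `transformers` text2text-generation pipeline. This is a minimal sketch, not an official usage recipe: the repo id is inferred from this card's name, and the generation settings simply mirror the inference parameters in the metadata above.

```python
from transformers import pipeline

# Sketch only: repo id inferred from this card's name and owner.
corrector = pipeline(
    "text2text-generation",
    model="pszemraj/t5-v1_1-base-ft-jflAUG",
)

# Generation settings copied from the inference parameters in the metadata.
result = corrector(
    "Anna and Mike is going skiing",
    max_length=64,
    min_length=4,
    num_beams=4,
    no_repeat_ngram_size=2,
    repetition_penalty=3.51,
    length_penalty=0.8,
    early_stopping=True,
)
print(result[0]["generated_text"])
```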

## Training and evaluation data

More information needed
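
The expanded JFLEG variant used for fine-tuning is not documented here. For reference, the original JFLEG corpus can be inspected with the `datasets` library; this sketch assumes the public `jfleg` dataset id, which provides only validation and test splits.

```python
from datasets import load_dataset

# Sketch: loads the original (unexpanded) JFLEG corpus for inspection.
# Each row pairs a noisy sentence with several human-written corrections.
jfleg = load_dataset("jfleg", split="validation")
print(jfleg[0])  # {'sentence': ..., 'corrections': [...]}
```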

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (mirrored in the sketch after this list):

- learning_rate: 6e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 5
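
For readers who want to reproduce a comparable run, the values above map onto `Seq2SeqTrainingArguments` roughly as follows. This is a sketch, not the author's actual training script: `output_dir` is a placeholder, and the Adam betas/epsilon listed above already match the `TrainingArguments` defaults.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch mirroring the hyperparameters listed above; not the original script.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-v1_1-base-ft-jflAUG",  # placeholder
    learning_rate=6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,  # 8 per device x 8 steps -> effective batch of 64
    seed=42,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    num_train_epochs=5,
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the defaults, so they are
    # not set explicitly. "distributed_type: multi-GPU" comes from the launcher
    # (e.g. accelerate/torchrun), not from these arguments.
)
```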

### Training results

### Framework versions

- Transformers 4.17.0
- Pytorch 1.10.0+cu111
- Datasets 2.0.0
- Tokenizers 0.11.6