---
license: cc-by-nc-sa-4.0
tags:
  - grammar
  - spelling
  - punctuation
  - error-correction
datasets:
  - jfleg
widget:
  - text: i can has cheezburger
    example_title: cheezburger
  - text: There car broke down so their hitching a ride to they're class.
    example_title: compound-1
  - text: >-
      so em if we have an now so with fito ringina know how to estimate the tren
      given the ereafte mylite trend we can also em an estimate is nod s i again
      tort watfettering an we have estimated the trend an called wot to be
      called sthat of exty right now we can and look at wy this should not hare
      a trend i becan we just remove the trend an and we can we now estimate
      tesees ona effect of them exty
    example_title: Transcribed Audio Example 2
  - text: >-
      My coworker said he used a financial planner to help choose his stocks so
      he wouldn't loose money.
    example_title: incorrect word choice (context)
  - text: >-
      good so hve on an tadley i'm not able to make it to the exla session on
      monday this week e which is why i am e recording pre recording an this
      excelleision and so to day i want e to talk about two things and first of
      all em i wont em wene give a summary er about ta ohow to remove trents in
      these nalitives from time series
    example_title: lowercased audio transcription output
  - text: Frustrated, the chairs took me forever to set up.
    example_title: dangling modifier
  - text: I would like a peice of pie.
    example_title: misspelling
  - text: >-
      Which part of Zurich was you going to go hiking in when we were there for
      the first time together? ! ?
    example_title: chatbot on Zurich
parameters:
  max_length: 128
  min_length: 4
  num_beams: 4
  repetition_penalty: 1.21
  length_penalty: 1
  early_stopping: true
---

A more recent version of this model can be found here. Training smaller and/or comparably sized models is a work in progress (WIP).

# t5-v1_1-base-ft-jflAUG

GOAL: a more robust and generalized grammar and spelling correction model that corrects everything in a single shot, with minimal impact on the semantics of correct sentences (i.e., it does not change things that do not need to be changed).

- This model can (at least from preliminary testing) handle large numbers of errors in the source text (e.g., from audio transcription) and still produce cohesive results.
- A fine-tuned version of google/t5-v1_1-base on an expanded version of the JFLEG dataset (a loading sketch follows).
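
As a minimal loading sketch (the repo id `pszemraj/t5-v1_1-base-ft-jflAUG` is assumed from this card's title; adjust it if the actual hub path differs):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Repo id assumed from the card title; adjust if the hub path differs.
model_name = "pszemraj/t5-v1_1-base-ft-jflAUG"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
```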

## Model description

- This is a work in progress; this fine-tuned model is v1.
- Long term: a generalized grammar and spelling correction model that can handle many kinds of errors at the same time.
- Currently, it behaves more like a "gibberish to mostly correct English" translator.

## Intended uses & limitations

- Try some tests with the examples here (an inference sketch follows this list).
- Known limitations so far: sentence fragments are not corrected (at least when entered individually), and more complicated pronoun agreement (they/he/her, etc.) is not always fixed.
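
For example, one of the widget inputs run through a `text2text-generation` pipeline with the generation parameters from the metadata above (again assuming the `pszemraj/t5-v1_1-base-ft-jflAUG` repo id):

```python
from transformers import pipeline

# Repo id assumed from the card title.
corrector = pipeline(
    "text2text-generation",
    model="pszemraj/t5-v1_1-base-ft-jflAUG",
)

text = "There car broke down so their hitching a ride to they're class."
result = corrector(
    text,
    max_length=128,
    min_length=4,
    num_beams=4,
    repetition_penalty=1.21,
    length_penalty=1.0,
    early_stopping=True,
)
print(result[0]["generated_text"])
```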

## Training and evaluation data

- Trained as text-to-text.
- JFLEG dataset plus additional selected and/or generated grammar corrections (see the loading sketch below).
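
The base JFLEG data can be inspected with the `datasets` library; note that the additional selected/generated corrections mentioned above are not bundled with the public dataset. A sketch:

```python
from datasets import load_dataset

# JFLEG ships "validation" and "test" splits; each source sentence
# is paired with several human-written corrections.
jfleg = load_dataset("jfleg", split="validation")

example = jfleg[0]
print(example["sentence"])     # original (possibly errorful) sentence
print(example["corrections"])  # list of reference corrections
```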

## Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto `Seq2SeqTrainingArguments` follows the list):

- learning_rate: 6e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 5
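
For reference, a sketch of how these hyperparameters map onto transformers' `Seq2SeqTrainingArguments`; the actual training script is not part of this card, and `output_dir` below is a placeholder:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: mirrors the hyperparameters listed above.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-v1_1-base-ft-jflAUG",  # placeholder
    learning_rate=6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=8,  # 8 x 8 -> effective train batch size 64
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    num_train_epochs=5,
)
```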

## Framework versions

- Transformers 4.17.0
- Pytorch 1.10.0+cu111
- Datasets 2.0.0
- Tokenizers 0.11.6