Update README.md
README.md (CHANGED)
@@ -52,21 +52,25 @@ parameters:
   early_stopping: True
 ---
 
-# flan-t5-xl
+# grammar-synthesis: flan-t5-xl
 
-
+<a href="https://colab.research.google.com/gist/pszemraj/43fc6a5c5acd94a3d064384dd1f3654c/demo-flan-t5-xl-grammar-synthesis.ipynb">
+  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
+</a>
+
+This model is a fine-tuned version of [google/flan-t5-xl](https://huggingface.co/google/flan-t5-xl) on an extended version of the `JFLEG` dataset.
 
 ## Model description
 
-The intent is to create a text2text language model that successfully
+The intent is to create a text2text language model that successfully performs "single-shot grammar correction" on potentially grammatically incorrect text **that could have many errors**, with the important qualifier that **it does not semantically change text/information that IS grammatically correct**.
 
-Compare some of the
+Compare some of the more severe error examples on [other grammar correction models](https://huggingface.co/models?dataset=dataset:jfleg) to see the difference :)
 
 ## Limitations
 
-- 
-- 
-- currently **
+- Dataset license: `cc-by-nc-sa-4.0`
+- Model license: `apache-2.0`
+- Currently **a work in progress**! While probably useful for "single-shot grammar correction" in many cases, **check the output for correctness, ok?**
 
 ## Training and evaluation data
 
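The new model description pairs with the Colab demo linked in the hunk above. As a minimal usage sketch of the same idea, the snippet below loads the checkpoint through the `transformers` pipeline; the repo id is inferred from the Colab gist name (an assumption, not stated in the diff), and `early_stopping=True` mirrors the `parameters:` block in the front matter.

```python
# Hypothetical usage sketch: the repo id is inferred from the Colab gist
# name and may not match the actual published checkpoint.
from transformers import pipeline

corrector = pipeline(
    "text2text-generation",
    model="pszemraj/flan-t5-xl-grammar-synthesis",
)

raw_text = "i wnet to the stroe yesterday and buyed three apple"
# generation kwargs are forwarded to model.generate()
result = corrector(raw_text, max_length=128, num_beams=4, early_stopping=True)
print(result[0]["generated_text"])
```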
@@ -76,6 +80,13 @@ More information needed
 
 ### Training hyperparameters
 
+
+#### Session One
+
+- TODO: add this. It was a single epoch at a higher LR
+
+#### Session Two
+
 The following hyperparameters were used during training:
 - learning_rate: 4e-05
 - train_batch_size: 4
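For reference, here is roughly how the Session Two settings would map onto `transformers.Seq2SeqTrainingArguments`. Only `learning_rate` and `train_batch_size` appear in this excerpt, so every other value below is an illustrative placeholder rather than something the card states.

```python
# Sketch only: learning_rate and per_device_train_batch_size come from the
# card; output_dir, num_train_epochs, and predict_with_generate are
# illustrative placeholders.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="grammar-synthesis-xl",  # placeholder path
    learning_rate=4e-5,                 # "learning_rate: 4e-05" from the card
    per_device_train_batch_size=4,      # "train_batch_size: 4" from the card
    num_train_epochs=1,                 # placeholder, not from the card
    predict_with_generate=True,         # common choice for seq2seq eval
)
```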