cointegrated committed ca09afa (parent: 8ee4042)
Create README.md

README.md (added)
This model was trained in [this notebook](https://git.mts.ai/ai/ml_lab/skoltech-nlp_lab/skoltech/task_oriented_TST/-/blob/main/transfer/formality_ranker_v1.ipynb) to predict whether an English sentence is formal or informal.

Base model: `roberta-base`

Datasets: [GYAFC](https://github.com/raosudha89/GYAFC-corpus) from [Rao and Tetreault, 2018](https://aclanthology.org/N18-1012) and the [online formality corpus](http://www.seas.upenn.edu/~nlp/resources/formality-corpus.tgz) from [Pavlick and Tetreault, 2016](https://aclanthology.org/Q16-1005).

Data augmentation: converting texts to upper or lower case, removing all punctuation, and adding a dot at the end of a sentence.

Loss: binary classification loss on GYAFC; in-batch ranking loss on the Pavlick and Tetreault (PT) data.
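The data augmentation listed above could look roughly like the sketch below. It is only an illustration of the four transformations, not the code from the training notebook: the function names, the `p` parameter, and the choice to apply at most one transformation per sentence are all assumptions.

```python
import random
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def to_upper(text: str) -> str:
    return text.upper()


def to_lower(text: str) -> str:
    return text.lower()


def remove_punctuation(text: str) -> str:
    return text.translate(_PUNCT_TABLE)


def add_final_dot(text: str) -> str:
    # Append a dot unless the sentence is empty or already ends
    # with sentence-final punctuation.
    stripped = text.rstrip()
    if not stripped or stripped[-1] in ".!?":
        return stripped
    return stripped + "."


# Hypothetical pool of augmentations; the real notebook may combine them differently.
AUGMENTATIONS = [to_upper, to_lower, remove_punctuation, add_final_dot]


def augment(text: str, rng: random.Random, p: float = 0.5) -> str:
    """With probability p, apply one randomly chosen augmentation."""
    if rng.random() >= p:
        return text
    return rng.choice(AUGMENTATIONS)(text)


if __name__ == "__main__":
    rng = random.Random(0)
    for sentence in ["Hey, what's up?!", "I would appreciate your reply"]:
        print(augment(sentence, rng, p=1.0))
```

Applying such transformations at training time presumably makes the classifier less sensitive to casing and punctuation, so that it relies more on wording.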
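The README does not spell out the exact form of the two losses, so the following is a minimal sketch under stated assumptions: binary cross-entropy on a single logit for GYAFC (label 1 = formal, 0 = informal), and a pairwise logistic ranking loss over all pairs in a batch for the PT data, assuming each PT sentence carries a real-valued formality rating. It is written in plain Python for readability; the function names are hypothetical, and a real training loop would use a tensor library instead.

```python
import math
from typing import Sequence


def softplus(x: float) -> float:
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def binary_classification_loss(logits: Sequence[float], labels: Sequence[int]) -> float:
    """Mean binary cross-entropy on raw logits (1 = formal, 0 = informal)."""
    # For a logit z and label y, BCE equals softplus(z) - y * z.
    return sum(softplus(z) - y * z for z, y in zip(logits, labels)) / len(logits)


def in_batch_ranking_loss(scores: Sequence[float], ratings: Sequence[float]) -> float:
    """Pairwise logistic loss over all in-batch pairs with different ratings.

    For each pair where sentence i is rated more formal than sentence j,
    the model is penalized unless it scores i above j.
    """
    pair_losses = []
    for i in range(len(scores)):
        for j in range(len(scores)):
            if ratings[i] > ratings[j]:
                pair_losses.append(softplus(scores[j] - scores[i]))
    # A batch with no comparable pairs contributes nothing.
    return sum(pair_losses) / len(pair_losses) if pair_losses else 0.0


if __name__ == "__main__":
    print(binary_classification_loss([2.0, -1.0], [1, 0]))
    print(in_batch_ranking_loss([2.0, 0.0, 1.0], [1.5, -2.0, 0.3]))
```

A ranking loss only uses the relative order of ratings within a batch, which is one plausible reason to prefer it over regression when the two corpora are annotated on different scales.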