How to use this model?

by LeMoussel - opened Mar 23, 2023

Discussion

LeMoussel

Mar 23, 2023

How can I use this model with transformers?
Would it be possible to have an example of Python code?

EIStakovskii

Owner Mar 23, 2023

edited Mar 23, 2023

Hi LeMoussel!
I updated the model description and included a Python code snippet showing how to use the model. Before running it, make sure the transformers library is installed.
Hope it helps.

LeMoussel

Mar 23, 2023

edited Mar 23, 2023

Thank you, it helped me a lot. Unfortunately, I am not getting the results I expected.
Here is an example. Sentences #3 and #5 are not very readable (Flesch Reading Ease scores: 57.52 and 34.17), so I expected "LABEL_0 -> NOT ACCEPTABLE" for them.
Does the model need more training?

from transformers import pipeline

texts = [
  "Les scientifiques ont découvert un nouveau traitement prometteur pour lutter contre le cancer.",
  "Les chercheurs ont mis en lumière un traitement novateur qui montre des promesses dans la lutte contre le cancer.",
  "Les enfants qui jouent ensemble dans la cour de récréation s'amusent beaucoup et se font de nouveaux amis.",
  "Les enfants qui s'adonnent à des activités ludiques conjointement dans la cour de l'école éprouvent une grande satisfaction et tissent des relations amicales nouvelles.",
  "La nuit dernière, j'ai rêvé que je volais au-dessus des nuages et que j'atteignais la lune.",
  "Dans mes songes nocturnes précédents, j'ai expérimenté l'envolée au-dessus des masses nuageuses et l'accession jusqu'à notre satellite naturel."
]

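# Load the fluency classifier from the Hugging Face Hub.
classifier = pipeline("text-classification", model="EIStakovskii/camembert_base_fluency")

# Classify each sentence and print the predicted label with its score.
for idx, text in enumerate(texts):
    score_model = classifier(text)
    print(f"Phrase {idx} Score Camembert_base_fluency: {score_model}")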

Results

Phrase 0 Score Camembert_base_fluency: [{'label': 'LABEL_1', 'score': 0.9996566772460938}]
Phrase 1 Score Camembert_base_fluency: [{'label': 'LABEL_1', 'score': 0.9997143149375916}]
Phrase 2 Score Camembert_base_fluency: [{'label': 'LABEL_1', 'score': 0.9996843338012695}]
Phrase 3 Score Camembert_base_fluency: [{'label': 'LABEL_1', 'score': 0.999721348285675}]
Phrase 4 Score Camembert_base_fluency: [{'label': 'LABEL_1', 'score': 0.9996987581253052}]
Phrase 5 Score Camembert_base_fluency: [{'label': 'LABEL_1', 'score': 0.9997372031211853}]
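
To make results like these easier to read, the raw labels can be mapped to names. This is only a sketch: the `LABEL_0` meaning comes from the "NOT ACCEPTABLE" reading used in this thread, the `LABEL_1` meaning is inferred from it, and `LABEL_NAMES` and `readable` are hypothetical helper names, not part of the model or of transformers.

```python
# Assumed mapping: LABEL_0 = not acceptable (as used in this thread),
# LABEL_1 = acceptable (inferred, not confirmed by the model card here).
LABEL_NAMES = {"LABEL_0": "not acceptable", "LABEL_1": "acceptable"}


def readable(prediction):
    """Turn one pipeline result, e.g. [{'label': ..., 'score': ...}],
    into a (name, rounded score) tuple."""
    top = prediction[0]
    name = LABEL_NAMES.get(top["label"], top["label"])
    return name, round(top["score"], 4)


# Example using the first result printed above.
print(readable([{"label": "LABEL_1", "score": 0.9996566772460938}]))
```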

Flesch Reading Ease score

Phrase 0 # Syllables: 24 Flesch Reading Ease: 61.33
Phrase 1 # Syllables: 27 Flesch Reading Ease: 84.68
Phrase 2 # Syllables: 25 Flesch Reading Ease: 85.69
Phrase 3 # Syllables: 40 Flesch Reading Ease: 57.52
Phrase 4 # Syllables: 23 Flesch Reading Ease: 87.72
Phrase 5 # Syllables: 37 Flesch Reading Ease: 34.17
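
For reference, here is a minimal sketch of how a Flesch Reading Ease score is computed from word, sentence and syllable counts. It uses the standard English coefficients (206.835, 1.015, 84.6), which is an assumption: the scores above may have been produced with a French-adapted variant or a different syllable counter, so this will not necessarily reproduce them. The function name `flesch_reading_ease` is illustrative.

```python
def flesch_reading_ease(n_words, n_sentences, n_syllables):
    """Standard (English) Flesch Reading Ease formula.

    Higher scores mean easier text. The counts are supplied by the caller,
    because syllable counting is language-specific.
    """
    if n_words <= 0 or n_sentences <= 0:
        raise ValueError("n_words and n_sentences must be positive")
    words_per_sentence = n_words / n_sentences
    syllables_per_word = n_syllables / n_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# One sentence of 10 words and 15 syllables.
score = flesch_reading_ease(n_words=10, n_sentences=1, n_syllables=15)
print(round(score, 2))
```

The score depends only on sentence length and word length, not on grammar or spelling.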

EIStakovskii

Owner Mar 23, 2023

Yes, I see what you mean. The thing is, however, that the model was not trained to measure how readable (complicated or easy to understand) a text is, which is what the Flesch–Kincaid readability tests are used for. Rather, it was trained to detect (as accurately as possible) poorly generated texts, i.e. texts with extraneous characters, typos, poor spelling, poor grammar and poor word order. So for your examples the model performed as expected.

Concerning its training: yes, it could be retrained, but only once there is some extra data for French in the style of CoLA (the Corpus of Linguistic Acceptability), where sentences are labeled 1 if they are grammatical and acceptable and 0 if they are ungrammatical. Note that the dataset for this model was created artificially using an array of heuristics and hand-crafted rules (e.g. shuffling the word order of every third sentence).
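
The word-order heuristic mentioned above might look roughly like the following sketch. The function name `make_fluency_dataset`, the `every` and `seed` parameters, and the label convention (0 for a corrupted sentence, 1 for an untouched one, matching the "LABEL_0 -> NOT ACCEPTABLE" reading used in this thread) are assumptions for illustration, not the actual script used to build the dataset.

```python
import random


def make_fluency_dataset(sentences, every=3, seed=0):
    """Return (text, label) pairs; every `every`-th sentence gets its
    words shuffled and is labeled 0, the others are kept and labeled 1."""
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    examples = []
    for position, sentence in enumerate(sentences, start=1):
        words = sentence.split()
        # Only corrupt sentences whose words can actually be reordered.
        if position % every == 0 and len(set(words)) > 1:
            shuffled = words[:]
            while shuffled == words:  # make sure the order really changes
                rng.shuffle(shuffled)
            examples.append((" ".join(shuffled), 0))
        else:
            examples.append((sentence, 1))
    return examples


sentences = [
    "Le chat dort sur le canapé.",
    "Il pleut depuis ce matin.",
    "Les enfants jouent dans la cour.",
    "Nous partons demain en vacances.",
    "Elle lit un livre passionnant.",
    "J'ai rêvé que je volais.",
]

for text, label in make_fluency_dataset(sentences):
    print(label, text)
```

A model trained on data like this learns to flag scrambled or malformed text, which is why well-formed but hard-to-read sentences are still classified as acceptable.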

EIStakovskii changed discussion status to closed Mar 27, 2023
