nikolasmoya/c4-binary-english-grammar-checker

Text Classification

Generated from Trainer

Inference Endpoints

nikolasmoya committed on Sep 6, 2023

Commit 3e6cb23 • 1 Parent(s): df3589a

Update README.md

Files changed (1)

README.md +29 -2

@@ -11,8 +11,35 @@ model-index:
   results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
+# Usage instructions:
+We recommend splitting the text into sentences and evaluating each sentence separately. You can do the splitting with spaCy:
+```
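+import pandas as pd
+import spacy
+from transformers import pipeline
+
+# Assumption: load this repository's model as a text-classification pipeline.
+grammar_checker = pipeline("text-classification", model="nikolasmoya/c4-binary-english-grammar-checker")
+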
+def clean_up_sentence(text: str) -> str:
+    text = text.replace("---", "")
+    text = text.replace("\n", " ")
+    text = text.strip()
+    if not text.endswith(('.', '!', '?', ":")):
+        # Since we are breaking a longer text into sentences ourselves, we should always end a sentence with a period.
+        text = text + "."
+    return text
+sentence_splitter = spacy.load("en_core_web_sm")
+spacy_document = sentence_splitter("This is a long text. It has two or more sentence. Spacy will break it down into sentences.")
+results = []
+for sentence in spacy_document.sents:
+    clean_text = clean_up_sentence(str(sentence))
+    classification = grammar_checker(clean_text)[0]
+    results.append({
+        "label": classification['label'],
+        "score": classification['score'],
+        "sentence": clean_text
+    })
+pd.DataFrame.from_dict(results)
+```
 # c4-binary-english-grammar-checker
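
To see what the cleanup step in the added usage snippet does to a sentence before it reaches the classifier, here is a minimal sketch. It re-declares `clean_up_sentence` as it appears in the README and runs it on a few made-up inputs (the `examples` list is illustrative, not from the commit). It needs neither spaCy nor the model, so it only demonstrates the string handling, not the classification.

```python
def clean_up_sentence(text: str) -> str:
    """Same cleanup as the README snippet: drop '---', flatten newlines,
    trim whitespace, and make sure the sentence ends with punctuation."""
    text = text.replace("---", "")
    text = text.replace("\n", " ")
    text = text.strip()
    if not text.endswith((".", "!", "?", ":")):
        # Sentences cut out of a longer text may lack a terminator.
        text = text + "."
    return text


# Illustrative inputs (assumed for this sketch).
examples = [
    "It has two or more sentence",       # missing final period
    "--- Title line\ncontinued here",    # separator and line break
    "Is this fine?",                     # already terminated, left unchanged
]

cleaned = [clean_up_sentence(e) for e in examples]
for before, after in zip(examples, cleaned):
    print(repr(before), "->", repr(after))
```

Note that the function appends a period only when the text does not already end in `.`, `!`, `?`, or `:`, so questions and exclamations pass through unchanged.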