m3hrdadfi committed on
Commit
8ecb874
1 Parent(s): 67feb1f

Update README.md

  1. README.md +70 -1
README.md CHANGED
@@ -2,8 +2,77 @@
  language: en
  widget:
  - text: "He had also stgruggled with addiction during his time in Congress ."
+ - text: "The review thoroughla assessed all aspects of JLENS SuR and CPG esign maturit and confidence ."
+ - text: "Letterma also apologized two his staff for the satyation ."
+ - text: "Vincent Jay had earlier won France 's first gold in gthe 10km biathlon sprint ."
  - text: "It is left to the directors to figure out hpw to bring the stry across to tye audience ."
 
  ---
 
- # Typo Detector
+ # Typo Detector
+
+
+ ## How to use
+
+ You can use this model with the Transformers pipeline for NER (token classification).
+
+ ### Installing requirements
+
+ ```bash
+ pip install transformers torch
+ ```
+
+ ### How to predict using pipeline
+
+ ```python
+ import torch
+ from transformers import AutoConfig, AutoTokenizer, AutoModelForTokenClassification
+ from transformers import pipeline
+
+
+ # Load the config, tokenizer, and token-classification model from the Hub.
+ model_name_or_path = "m3hrdadfi/typo-detector-distilbert-en"
+ config = AutoConfig.from_pretrained(model_name_or_path)
+ tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
+ model = AutoModelForTokenClassification.from_pretrained(model_name_or_path, config=config)
+ # aggregation_strategy="average" merges sub-word tokens into word-level spans.
+ nlp = pipeline('token-classification', model=model, tokenizer=tokenizer, aggregation_strategy="average")
+ ```
+
+ ```python
+ sentences = [
+     "He had also stgruggled with addiction during his time in Congress .",
+     "The review thoroughla assessed all aspects of JLENS SuR and CPG esign maturit and confidence .",
+     "Letterma also apologized two his staff for the satyation .",
+     "Vincent Jay had earlier won France 's first gold in gthe 10km biathlon sprint .",
+     "It is left to the directors to figure out hpw to bring the stry across to tye audience .",
+ ]
+
+ for sentence in sentences:
+     # Slice each detected span out of the sentence using its character offsets.
+     typos = [sentence[r["start"]: r["end"]] for r in nlp(sentence)]
+
+     # Wrap every detected typo in <i> tags.
+     detected = sentence
+     for typo in typos:
+         detected = detected.replace(typo, f'<i>{typo}</i>')
+
+     print(" [Input]: ", sentence)
+     print("[Detected]: ", detected)
+     print("-" * 130)
+ ```
+
+ Output:
+ ```text
+ [Input]: He had also stgruggled with addiction during his time in Congress .
+ [Detected]: He had also <i>stgruggled</i> with addiction during his time in Congress .
+ ----------------------------------------------------------------------------------------------------------------------------------
+ [Input]: The review thoroughla assessed all aspects of JLENS SuR and CPG esign maturit and confidence .
+ [Detected]: The review <i>thoroughla</i> assessed all aspects of JLENS SuR and CPG <i>esign</i> <i>maturit</i> and confidence .
+ ----------------------------------------------------------------------------------------------------------------------------------
+ [Input]: Letterma also apologized two his staff for the satyation .
+ [Detected]: <i>Letterma</i> also apologized <i>two</i> his staff for the <i>satyation</i> .
+ ----------------------------------------------------------------------------------------------------------------------------------
+ [Input]: Vincent Jay had earlier won France 's first gold in gthe 10km biathlon sprint .
+ [Detected]: Vincent Jay had earlier won France 's first gold in <i>gthe</i> 10km biathlon sprint .
+ ----------------------------------------------------------------------------------------------------------------------------------
+ [Input]: It is left to the directors to figure out hpw to bring the stry across to tye audience .
+ [Detected]: It is left to the directors to figure out <i>hpw</i> to bring the <i>stry</i> across to <i>tye</i> audience .
+ ----------------------------------------------------------------------------------------------------------------------------------
+ ```
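
A note on the highlighting loop in this README: it marks typos with `str.replace`, which tags every occurrence of the detected string, including correct uses of the same word elsewhere in the sentence. A minimal sketch of an offset-based alternative is below. It is not part of this commit: `highlight_spans` is a hypothetical helper, and the hand-written `spans` list only imitates the `start`/`end` keys that the pipeline results are indexed with in the README code.

```python
def highlight_spans(sentence, spans, open_tag="<i>", close_tag="</i>"):
    """Wrap each character span of `sentence` in tags.

    `spans` is a list of dicts with "start" and "end" character offsets,
    the same keys the README code reads from each pipeline result.
    Spans are assumed not to overlap.
    """
    out = sentence
    # Work right to left so earlier offsets stay valid after each insertion.
    for span in sorted(spans, key=lambda s: s["start"], reverse=True):
        start, end = span["start"], span["end"]
        out = out[:start] + open_tag + out[start:end] + close_tag + out[end:]
    return out


# Hand-written offsets standing in for model output (illustrative only).
sentence = "Letterma also apologized two his staff for the satyation ."
spans = [
    {"start": 0, "end": 8},    # "Letterma"
    {"start": 25, "end": 28},  # "two"
    {"start": 47, "end": 56},  # "satyation"
]
print(highlight_spans(sentence, spans))
```

With the real pipeline, the call would presumably be `highlight_spans(sentence, nlp(sentence))`. For the sentence above the result matches the `[Detected]` line in the README output, but only the flagged occurrence of a repeated word is tagged.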