Pclanglais committed on
Commit
013b723
1 Parent(s): a2d551d

Update README.md

Files changed (1)
  1. README.md +5 -2
README.md CHANGED
@@ -6,7 +6,10 @@ Estienne was trained on 2,000 example of manually annotated texts, excerpted at
 
  Given the diversity of the corpus, Estienne should work out on diverse document formats in European languages.
 
- As Deberta remove newline by default and has no support for it in the tokenizer, they should be replaced by pilcrows (¶)
+ The model is named in reference to the humanist Henri Estienne who introduced many practices of text segmentation still in use in scholarly edition today.
+
+ ## Use
+ As Deberta remove newline by default and has no support for it in the tokenizer, they should be replaced by pilcrows (¶).
 
  Estienne supports the following segmentations:
  * **Text**
@@ -21,4 +24,4 @@ Estienne supports the following segmentations:
  * **Date** - statement of date and time, common in letters and newspaper articles.
  * **Keyword** - list of keywords, especially common in scientific publications.
 
- The model is named in reference to the humanist Henri Estienne who introduced many practices of text segmentation still in use in scholarly edition today.
+ ## Example
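
The `## Use` note in this diff says that newlines should be replaced by pilcrows (¶) before text reaches the tokenizer. A minimal Python sketch of that preprocessing step follows. It assumes a one-for-one replacement with no added spacing, which the README does not specify, and the helper names `newlines_to_pilcrows` and `pilcrows_to_newlines` are hypothetical and do not come from the README.

```python
PILCROW = "¶"


def newlines_to_pilcrows(text: str) -> str:
    """Replace every line break with a pilcrow (assumed one-for-one)."""
    # Normalize Windows (\r\n) and bare \r line endings to \n first,
    # so each line break yields exactly one pilcrow.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.replace("\n", PILCROW)


def pilcrows_to_newlines(text: str) -> str:
    """Inverse step: restore newlines in text read back from the model."""
    return text.replace(PILCROW, "\n")


if __name__ == "__main__":
    sample = "Paris, 12 May 1890\nDear Sir,\r\nI have received your letter."
    print(newlines_to_pilcrows(sample))
```

The inverse helper is only useful if the original text never contains a literal pilcrow; if it might, a different placeholder or an escaping step would be needed.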