nevmenandr committed on
Commit
0b77ddf
1 Parent(s): ff24464

Update README.md

  1. README.md +1 -1
README.md CHANGED
@@ -23,7 +23,7 @@ The texts for the training corpus are taken from two datasets published in the [
 
 Only texts published after 1835 (the era of realism) remain in the corpus.
 
-The texts are marked up using the Russian version of the booknlp library, which highlighted the characters of the fictional works.
+The texts are marked up using the Russian version of the booknlp library, which highlighted the characters of the fictional works. Texts presented in old orthography have been converted to modern orthography with the help of a [package](https://pypi.org/project/prereform2modern/).
 
 Each character in the text was replaced by its id of kind:
 
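The sentence added in this commit says that texts in pre-reform orthography were converted to modern spelling. The sketch below is only a toy illustration of what such a conversion involves. It does not use or reproduce the API of the `prereform2modern` package, and the function name `naive_modernize` is hypothetical. It assumes a purely mechanical mapping of obsolete letters and removal of the word-final hard sign. Real conversion also has to handle cases this ignores, such as the adjective endings -аго/-яго.

```python
import re

# Assumption: a simplified one-to-one mapping of obsolete letters.
# This is illustrative only and is NOT how prereform2modern works internally.
LETTER_MAP = str.maketrans({
    "ѣ": "е", "Ѣ": "Е",  # yat
    "і": "и", "І": "И",  # decimal i
    "ѳ": "ф", "Ѳ": "Ф",  # fita
    "ѵ": "и", "Ѵ": "И",  # izhitsa
})

# A hard sign at the very end of a word was dropped by the reform;
# a hard sign inside a word (as in "объявление") is kept.
FINAL_HARD_SIGN = re.compile(r"(?<=\w)[ъЪ]\b")


def naive_modernize(text: str) -> str:
    """Apply the toy letter mapping to a pre-reform Russian string."""
    text = FINAL_HARD_SIGN.sub("", text)
    return text.translate(LETTER_MAP)


if __name__ == "__main__":
    print(naive_modernize("хлѣбъ и міръ"))
    print(naive_modernize("объявленіе"))
```

For actual corpus preparation, the package linked in the README is the appropriate tool. The mapping above only shows the kind of normalization that step performs.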