gabriel-p committed on
Commit
b0605aa
1 Parent(s): 69558da

Update README

  1. README.md +31 -0
README.md CHANGED
@@ -1,3 +1,34 @@
1
  ---
2
  license: mit
3
  ---
4
+
5
+
6
+ .
7
+ # Spanish truecasing model
8
+
9
+ This is a Spanish truecasing model that works with <b>Dalton Fury</b>'s truecase Python project:
10
+
11
+ https://github.com/daltonfury42/truecase
12
+
13
+ You can install the truecase package from PyPI:
14
+
15
+ https://pypi.org/project/truecase/
16
+
17
+ ## Quick start
18
+
19
+ To use the Spanish model, use the TrueCaser.py file uploaded to this repository:
20
+
21
+ https://huggingface.co/HURIDOCS/spanish_truecasing/blob/main/TrueCaser.py
22
+
23
+
24
+
25
+ ## Notes
26
+
27
+ The model was trained on the Europarl dataset, which contains transcriptions of European Parliament discussions:
28
+
29
+ https://www.statmt.org/europarl/
30
+ Europarl: A Parallel Corpus for Statistical Machine Translation, Philipp Koehn, MT Summit 2005
31
+
32
+ The dataset was loaded with the Hugging Face `datasets` library (`from datasets import load_dataset`):
33
+
34
+ europarl = load_dataset('large_spanish_corpus', name='Europarl')
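
To make the idea behind the README concrete, here is a minimal sketch of how casing statistics gathered from a corpus can restore capitalization in lowercased Spanish text. It is a simplified unigram stand-in written for illustration, not the algorithm used by the truecase project or by TrueCaser.py. The function names `build_casing_counts` and `truecase`, the `"text"` record field, and the sample sentences are all assumptions.

```python
import re
from collections import Counter, defaultdict

# Matches runs of Unicode letters (so "reunió" and "señor" stay whole).
WORD_RE = re.compile(r"[^\W\d_]+")


def build_casing_counts(records):
    """Count how often each cased form of a word occurs.

    `records` is any iterable of dicts with a "text" key. The field name
    is an assumption about how the corpus rows are shaped.
    """
    counts = defaultdict(Counter)
    for record in records:
        words = WORD_RE.findall(record["text"])
        # Skip the first word: its capital letter comes from its position
        # in the sentence, not from the word itself.
        for word in words[1:]:
            counts[word.lower()][word] += 1
    return counts


def truecase(sentence, counts):
    """Replace each word with its most frequent cased form, if known."""

    def restore(match):
        word = match.group(0).lower()
        forms = counts.get(word)
        return forms.most_common(1)[0][0] if forms else word

    result = WORD_RE.sub(restore, sentence)
    # Capitalize the first letter of the sentence.
    for i, ch in enumerate(result):
        if ch.isalpha():
            return result[:i] + ch.upper() + result[i + 1:]
    return result


# Hypothetical in-memory sample standing in for corpus rows.
sample_records = [
    {"text": "El Parlamento Europeo se reunió en Bruselas."},
    {"text": "La sesión del Parlamento Europeo continuó en Bruselas."},
]
counts = build_casing_counts(sample_records)
restored = truecase("el parlamento europeo se reunió en bruselas.", counts)
```

With the corpus named in the README, the same counting function could in principle be fed rows from the object returned by `load_dataset`, provided each row exposes its sentence under a `"text"` key; that field name has not been checked against the dataset here.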