Update README.md
Commit 618458e, committed by AlienKevin
1 parent: 89d5a75

README.md CHANGED
@@ -5,17 +5,14 @@ language:
 pipeline_tag: image-to-text
 ---

-Convert
-
-# Target: Convert Scanned IPA symbols to Pinyin
+# Target: Convert Scanned Images of IPA symbols to Pinyin
 Scanned images of IPA phonetic symbols for Chengdunese (成都话) in The Great Dictionary of Modern Chinese Dialects (現代漢語方言大詞典).

-TODO: labeled part of the test set.
-
 # Training and Test Set
 * 2,553 images of IPA phonetic symbols generated from Pinyin pronunciations found in Sichuanese Dialect Dictionary (四川方言词典 教你一口地道的四川话) and the word list of the Shupin (蜀拼) input method.
 * 80/20 split on train/test

 # Results
 * Trained for 180 steps with a batch size of 32
-* Final Character Error Rate of 0.795%
+* Final Character Error Rate of 0.795% on test set
+* TODO: label part of the scanned images to see if model generalizes on target task
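
The README reports a Character Error Rate on the test set but does not say how it was computed. The sketch below shows the commonly used definition, assuming CER is the total Levenshtein edit distance between predicted and reference Pinyin strings divided by the total number of reference characters. The function names and the sample strings are hypothetical and are not taken from the repository.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions each cost 1."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a character of ref
                curr[j - 1] + 1,     # insert a character of hyp
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(references: list[str], hypotheses: list[str]) -> float:
    """Assumed definition: total edits over total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


if __name__ == "__main__":
    # Hypothetical placeholder strings, not data from the actual test set.
    refs = ["cen2du1", "si4cuan1"]
    hyps = ["cen2du1", "si4cuan2"]
    print(f"CER = {character_error_rate(refs, hyps):.3%}")
```

Under this definition the rate is pooled over all test strings rather than averaged per image. If the reported 0.795% was instead a per-sample average, the two numbers could differ slightly.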