Commit 14f91f3 by patrickvonplaten (1 parent: 9b5bc8e)

up

- README.md +11 -0
- alphabet.json +1 -0
- language_model/attrs.json +1 -0
- language_model/bugs_bunny_kenlm.arpa +19 -0
- language_model/unigrams.txt +5 -0
README.md
ADDED
@@ -0,0 +1,11 @@
+This is an example of how a kenLM model can be downloaded with [PyCTCDecode](https://github.com/kensho-technologies/pyctcdecode).
+
+Simply run the following code:
+
+```python
+from pyctcdecode import BeamSearchDecoderCTC
+
+decoder = BeamSearchDecoderCTC.load_from_hf_hub("kensho/beamsearch_decoder_dummy")
+```
+
+The model was created by [Patrick von Platen](https://huggingface.co/patrickvonplaten) for demonstration purposes.
alphabet.json
ADDED
@@ -0,0 +1 @@
+{"is_bpe": false, "labels": ["<unk>", "<s>", "</s>", "bugs", "bunny"]}
language_model/attrs.json
ADDED
@@ -0,0 +1 @@
+{"alpha": 0.5, "beta": 1.5, "unk_score_offset": -10.0, "score_boundary": true}
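To make the values in attrs.json concrete, here is a minimal sketch of how such weights are typically applied to a language-model score during beam search. It assumes the common shallow-fusion form, in which `alpha` scales the log-probability (converted from log10 to natural log) and `beta` is a per-word bonus; the helper name `weighted_word_score` is hypothetical, and the library's exact formula may differ. `score_boundary` is not modelled here.

```python
import json
import math

# Values copied from language_model/attrs.json in this commit.
ATTRS_JSON = '{"alpha": 0.5, "beta": 1.5, "unk_score_offset": -10.0, "score_boundary": true}'
attrs = json.loads(ATTRS_JSON)


def weighted_word_score(log10_prob, is_unknown, attrs):
    """Hypothetical helper: combine a raw log10 LM score with the weights.

    Assumption: score = alpha * ln-probability + beta, with
    unk_score_offset added to the raw score for out-of-vocabulary words.
    """
    if is_unknown:
        log10_prob += attrs["unk_score_offset"]
    # math.log(10) converts a log10 value to a natural-log value.
    return attrs["alpha"] * log10_prob * math.log(10) + attrs["beta"]


known = weighted_word_score(0.0, False, attrs)
unknown = weighted_word_score(0.0, True, attrs)
print(known, unknown)
```

Under this assumption, a word the model considers certain (log10 probability 0) receives only the `beta` bonus, while an unknown word is pushed far below it by `unk_score_offset`.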
language_model/bugs_bunny_kenlm.arpa
ADDED
@@ -0,0 +1,19 @@
+\data\
+ngram 1=5
+ngram 2=5
+
+\1-grams:
+-10 <unk> 0
+0 <s> 0
+0 </s> 0
+0 bugs 0
+0 bunny 0
+
+\2-grams:
+0 bunny </s>
+-10 bugs </s>
+0 <s> bugs
+-10 <s> bunny
+0 bugs bunny
+
+\end\
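The ARPA file above is small enough to score by hand. The following sketch, which uses only the Python standard library and not KenLM itself, parses the same text and sums bigram log10 probabilities with a simple back-off to unigrams. The function names `parse_arpa`, `bigram_log10`, and `sentence_log10` are illustrative and not part of any library.

```python
# Contents of language_model/bugs_bunny_kenlm.arpa from this commit.
ARPA_TEXT = r"""\data\
ngram 1=5
ngram 2=5

\1-grams:
-10 <unk> 0
0 <s> 0
0 </s> 0
0 bugs 0
0 bunny 0

\2-grams:
0 bunny </s>
-10 bugs </s>
0 <s> bugs
-10 <s> bunny
0 bugs bunny

\end\
"""


def parse_arpa(text):
    """Return {order: {ngram_tuple: (log10_prob, backoff)}}."""
    tables = {}
    order = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "\\data\\" or line.startswith("ngram "):
            continue
        if line == "\\end\\":
            break
        if line.startswith("\\") and line.endswith("-grams:"):
            order = int(line[1:line.index("-")])
            tables[order] = {}
            continue
        fields = line.split()
        words = tuple(fields[1:1 + order])
        # The back-off weight is optional (absent on highest-order entries).
        backoff = float(fields[1 + order]) if len(fields) > 1 + order else 0.0
        tables[order][words] = (float(fields[0]), backoff)
    return tables


def bigram_log10(tables, prev, word):
    """log10 P(word | prev), backing off to the unigram if needed."""
    if (prev, word) in tables[2]:
        return tables[2][(prev, word)][0]
    backoff = tables[1].get((prev,), (0.0, 0.0))[1]
    unigram = tables[1].get((word,), tables[1][("<unk>",)])[0]
    return backoff + unigram


def sentence_log10(tables, words):
    """Sum bigram scores over <s> w1 ... wn </s>."""
    tokens = ["<s>", *words, "</s>"]
    return sum(bigram_log10(tables, p, w) for p, w in zip(tokens, tokens[1:]))


tables = parse_arpa(ARPA_TEXT)
print(sentence_log10(tables, ["bugs", "bunny"]))
print(sentence_log10(tables, ["bunny", "bugs"]))
```

With these tables, "bugs bunny" uses only bigrams with log10 probability 0, whereas "bunny bugs" hits the two penalised bigrams (`<s> bunny` and `bugs </s>`), which is what makes this dummy model favour the intended word order.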
language_model/unigrams.txt
ADDED
@@ -0,0 +1,5 @@
+<unk>
+<s>
+</s>
+bugs
+bunny