patrickvonplaten committed on
Commit
14f91f3
1 Parent(s): 9b5bc8e
README.md ADDED
@@ -0,0 +1,11 @@
+ This is an example of how a KenLM language model can be downloaded with [PyCTCDecode](https://github.com/kensho-technologies/pyctcdecode).
+
+ Simply run the following code:
+
+ ```python
+ from pyctcdecode import BeamSearchDecoderCTC
+
+ decoder = BeamSearchDecoderCTC.load_from_hf_hub("kensho/beamsearch_decoder_dummy")
+ ```
+
+ The model was created by [Patrick von Platen](https://huggingface.co/patrickvonplaten) for demonstration purposes.
alphabet.json ADDED
@@ -0,0 +1 @@
+ {"is_bpe": false, "labels": ["<unk>", "<s>", "</s>", "bugs", "bunny"]}
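For readers unfamiliar with the alphabet file, here is a minimal sketch of how its `labels` list could be read. It assumes that each label's position corresponds to one column of the acoustic model's per-frame scores. The names `label_to_index` and `greedy_labels` are hypothetical helpers for illustration, not part of pyctcdecode.

```python
import json

# Contents of alphabet.json from this commit, inlined so the sketch is self-contained.
ALPHABET_JSON = '{"is_bpe": false, "labels": ["<unk>", "<s>", "</s>", "bugs", "bunny"]}'

alphabet = json.loads(ALPHABET_JSON)
labels = alphabet["labels"]

# Hypothetical helper mapping: label -> column index in the score matrix (assumed layout).
label_to_index = {label: index for index, label in enumerate(labels)}


def greedy_labels(frame_scores):
    """Pick the highest-scoring label for each frame (illustration only, no beam search)."""
    picked = []
    for scores in frame_scores:
        best_index = max(range(len(scores)), key=scores.__getitem__)
        picked.append(labels[best_index])
    return picked


# Two made-up frames of scores, one column per label.
example_frames = [
    [0.0, 0.0, 0.0, 0.9, 0.1],
    [0.0, 0.0, 0.0, 0.2, 0.8],
]
print(label_to_index)
print(greedy_labels(example_frames))
```

The greedy pick only shows the index-to-label mapping. The beam search decoder additionally weighs hypotheses with the language model.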
language_model/attrs.json ADDED
@@ -0,0 +1 @@
+ {"alpha": 0.5, "beta": 1.5, "unk_score_offset": -10.0, "score_boundary": true}
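The following sketch shows one plausible way these attributes could enter a per-word language model score. It assumes the usual shallow-fusion form: `alpha` scales the language model log-probability (converted from log10 to natural log), `beta` is a per-word bonus, and `unk_score_offset` is added for out-of-vocabulary words. The function `weighted_word_score` is hypothetical, and the exact formula inside pyctcdecode may differ.

```python
import json
import math

# Contents of language_model/attrs.json from this commit, inlined for the sketch.
ATTRS_JSON = '{"alpha": 0.5, "beta": 1.5, "unk_score_offset": -10.0, "score_boundary": true}'

attrs = json.loads(ATTRS_JSON)


def weighted_word_score(log10_prob, attrs, is_unknown=False):
    """Hypothetical weighting: alpha * ln-probability + beta (assumed, not verified)."""
    raw = log10_prob
    if is_unknown:
        # Assumed: unknown words receive an additional penalty in log10 space.
        raw += attrs["unk_score_offset"]
    # Convert log10 to natural log, scale by alpha, then add the word bonus beta.
    return attrs["alpha"] * raw * math.log(10) + attrs["beta"]


known_score = weighted_word_score(0.0, attrs)
unknown_score = weighted_word_score(0.0, attrs, is_unknown=True)
print(known_score, unknown_score)
```

Under this assumption, a word with log-probability 0 still gains `beta`, while an unknown word is pushed far below it by the offset.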
language_model/bugs_bunny_kenlm.arpa ADDED
@@ -0,0 +1,19 @@
+ \data\
+ ngram 1=5
+ ngram 2=5
+
+ \1-grams:
+ -10 <unk> 0
+ 0 <s> 0
+ 0 </s> 0
+ 0 bugs 0
+ 0 bunny 0
+
+ \2-grams:
+ 0 bunny </s>
+ -10 bugs </s>
+ 0 <s> bugs
+ -10 <s> bunny
+ 0 bugs bunny
+
+ \end\
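To make the ARPA file's meaning concrete, here is a small pure-Python sketch that parses it and scores a word sequence with standard bigram backoff. It is an illustration only and does not use KenLM. The helpers `parse_arpa` and `score_sentence` are hypothetical and handle only the unigram and bigram sections present in this file.

```python
# Contents of language_model/bugs_bunny_kenlm.arpa from this commit.
ARPA = r"""\data\
ngram 1=5
ngram 2=5

\1-grams:
-10 <unk> 0
0 <s> 0
0 </s> 0
0 bugs 0
0 bunny 0

\2-grams:
0 bunny </s>
-10 bugs </s>
0 <s> bugs
-10 <s> bunny
0 bugs bunny

\end\
"""


def parse_arpa(text):
    """Return (unigrams, bigrams) with log10 probabilities from a 2-gram ARPA file."""
    unigrams, bigrams = {}, {}
    section = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("ngram ") or line in ("\\data\\", "\\end\\"):
            continue
        if line.endswith("-grams:"):
            # Section headers look like "\1-grams:"; extract the n-gram order.
            section = int(line[1:line.index("-")])
            continue
        fields = line.split()
        if section == 1:
            backoff = float(fields[2]) if len(fields) > 2 else 0.0
            unigrams[fields[1]] = (float(fields[0]), backoff)
        elif section == 2:
            bigrams[(fields[1], fields[2])] = float(fields[0])
    return unigrams, bigrams


def score_sentence(words, unigrams, bigrams):
    """Sum log10 bigram scores over <s> w1 ... wn </s>, backing off to unigrams."""
    tokens = ["<s>"] + [w if w in unigrams else "<unk>" for w in words] + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        if (prev, cur) in bigrams:
            total += bigrams[(prev, cur)]
        else:
            # Backoff: backoff weight of the context plus the unigram log-probability.
            total += unigrams[prev][1] + unigrams[cur][0]
    return total


unigrams, bigrams = parse_arpa(ARPA)
print(score_sentence(["bugs", "bunny"], unigrams, bigrams))  # 0.0
print(score_sentence(["bunny", "bugs"], unigrams, bigrams))  # -20.0
```

The model strongly prefers "bugs bunny" (log10 score 0) over "bunny bugs" (log10 score -20), which is what makes it useful as a dummy for demonstrating language model rescoring.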
language_model/unigrams.txt ADDED
@@ -0,0 +1,5 @@
+ <unk>
+ <s>
+ </s>
+ bugs
+ bunny