Update README.md
README.md
CHANGED
@@ -4,12 +4,15 @@ This model takes in a word as an input and splits it into syllables. I did this
 ## Calling the Model
 ```python
 from transformers import AutoTokenizer, T5ForConditionalGeneration
+
 model = T5ForConditionalGeneration.from_pretrained('imjeffhi/syllabizer')
 tokenizer = AutoTokenizer.from_pretrained('imjeffhi/syllabizer')
+
 def generate_output(word):
     tokens = tokenizer(word, return_tensors='pt')
     output = model.generate(**tokens, do_sample=False, max_length=30, early_stopping=True)[0]
     return tokenizer.decode(output, skip_special_tokens=True)
+
 syllables = generate_output('syllabizer')
 ```
 The model returns syllables in spaced format. See output below.
@@ -20,10 +23,13 @@ syl la biz er
 You can easily syllabize an entire sentence/paragraph and/or convert the output into a list of syllables with the following code:
 ```python
 from transformers import pipeline
+
 syllabizer_pipe = pipeline('text2text-generation', model = 'imjeffhi/syllabizer', tokenizer='imjeffhi/syllabizer')
+
 sentence = "A unit of spoken language consisting of a single uninterrupted sound formed by a vowel, diphthong, or syllabic consonant alone, or by any of these sounds preceded, followed, or surrounded by one or more consonants."
 words = sentence.split(" ")
 output = syllabizer_pipe(words, batch_size=len(words),do_sample=False, max_length=30, early_stopping=True)
+
 [{words[i]: gen_text['generated_text'].split(" ")} for i, gen_text in enumerate(output)]
 ```
 
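Note on the second snippet: it splits the sentence on single spaces, so tokens such as `vowel,` and `consonants.` reach the model with their punctuation attached. The README does not say how the model treats punctuation, so the following is only a sketch of one possible pre- and post-processing step, not part of this commit. The helper names `sentence_to_words` and `to_syllable_map` are hypothetical, and stripping punctuation is an assumption about what input the model expects.

```python
import string


def sentence_to_words(sentence):
    """Split on whitespace and strip leading/trailing punctuation from each token.

    Assumption: the model expects bare words, so 'vowel,' is reduced to 'vowel'.
    Tokens that consist only of punctuation are dropped.
    """
    stripped = (token.strip(string.punctuation) for token in sentence.split())
    return [word for word in stripped if word]


def to_syllable_map(words, generated_texts):
    """Pair each word with its syllables.

    `generated_texts` holds the model's spaced output strings,
    e.g. 'syl la biz er' for 'syllabizer'.
    """
    return [{word: text.split()} for word, text in zip(words, generated_texts)]
```

With the pipeline call from the README, `generated_texts` would be `[gen_text['generated_text'] for gen_text in output]`, and `words` would come from `sentence_to_words(sentence)` instead of `sentence.split(" ")`.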