Updating diversity calculation
app.py
CHANGED
@@ -480,12 +480,12 @@ with gr.Blocks(title="Automatic Literacy and Speech Assesmen") as demo:
 to understand.
 """)
 gr.Markdown("""**Lexical Diversity**- The lexical diversity score is computed by taking the ratio of unique similar words to total similar words
-
+. The similarity is computed as if the cosine similarity of the word2vec embeddings is greater than .75. It is bad writing/speech
 practice to repeat the same words when it's possible not to. Vocabulary diversity is generally computed by taking the ratio of unique
 strings/ total strings. This does not give an indication if the person has a large vocabulary or if the topic does not require a diverse
 vocabulary to express it. This algorithm only scores the text based on how many times a unique word was chosen for a semantic idea, e.g.,
 "Forest" and "Woods" are 2 words to represent one semantic idea, so this would receive a 100% lexical diversity score, vs using the word
-"Forest" twice would yield you a 25% diversity score, (1 unique word/ 2 total words)
+"Forest" twice would yield you a 25% diversity score, (1 unique word/ 2 total words)
 """)
 gr.Markdown("""**Speech Pronunciation Scoring-**- The Wave2Vec 2.0 model is utilized to convert audio into text in real-time. The model predicts words or phonemes
 (smallest unit of speech distinguishing one word (or word element) from another) from the input audio from the user. Due to the nature of the model,
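
The Lexical Diversity description in this hunk can be read as the following minimal sketch. It is not the code in app.py, which this hunk does not show. The function names, the greedy grouping strategy, the per-idea averaging, and the three-dimensional toy vectors are all assumptions made for illustration; a real run would look the vectors up in a trained word2vec model. Only the cosine-similarity test and the .75 threshold come from the text. Under this reading, using "Forest" twice gives 1 unique word / 2 total words = 50%, whereas the text states 25%, so the app's actual formula may differ.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def type_token_ratio(tokens):
    """Plain vocabulary diversity: unique strings / total strings."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def group_by_meaning(tokens, embeddings, threshold=0.75):
    """Greedy grouping (an assumption): a token joins the first group whose
    first member it matches exactly or exceeds the similarity threshold with."""
    groups = []
    for tok in tokens:
        for group in groups:
            rep = group[0]
            similar = (
                tok in embeddings
                and rep in embeddings
                and cosine_similarity(embeddings[tok], embeddings[rep]) > threshold
            )
            if tok == rep or similar:
                group.append(tok)
                break
        else:
            # No existing group matched, so this token starts a new idea.
            groups.append([tok])
    return groups


def semantic_diversity(tokens, embeddings, threshold=0.75):
    """Mean over semantic groups of unique words / total words in the group."""
    groups = group_by_meaning(tokens, embeddings, threshold)
    if not groups:
        return 0.0
    ratios = [len(set(g)) / len(g) for g in groups]
    return sum(ratios) / len(ratios)


# Hypothetical toy vectors standing in for word2vec embeddings.
TOY_EMBEDDINGS = {
    "forest": [0.9, 0.1, 0.0],
    "woods": [0.8, 0.2, 0.1],
    "river": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    print(semantic_diversity(["forest", "woods"], TOY_EMBEDDINGS))
    print(semantic_diversity(["forest", "forest"], TOY_EMBEDDINGS))
```

Averaging per semantic group, rather than over the whole text, is what separates this score from the plain unique strings / total strings ratio: a text that repeats one idea with the same word is penalized for that idea only.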