nbest-pron-score
NAME
nbest-pron-score - score pronunciations and pauses in N-best hypotheses
SYNOPSIS
nbest-pron-score
[-help]
option
...
DESCRIPTION
nbest-pron-score
reads N-best lists and computes log probability scores for the pronunciations
and pauses contained in them.
Pronunciation scoring requires that the N-best lists
contain phone backtraces in the "NBestList2.0" format described in
nbest-format(5).
Pronunciation scores are computed from the probabilities in a dictionary.
Pauses are binned into three length classes (none, short, long) and
scored according to a trigram language model that conditions the pause length
on the left and right neighboring words, in that order (so that bigram
backoff uses the left neighbor only).
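As an illustration only, the conditioning order described above can be sketched in Python. This is not the tool's code: the function names and the tags "NP" and "SP" are hypothetical, and the sketch assumes one pause tag between each pair of adjacent words. Handling of sentence boundaries is not described here and is left out.

```python
def pause_contexts(left_word, right_word):
    """Return the conditioning contexts for a pause, most specific first."""
    return [
        (left_word, right_word),  # trigram: left and right neighbors
        (left_word,),             # bigram backoff: left neighbor only
        (),                       # unigram: no context
    ]


def pause_events(words, pause_tags):
    """Pair each between-word pause tag with its left and right neighbor.

    Assumes exactly one tag per gap between adjacent words (an assumption
    made for this sketch, not a documented input format).
    """
    if len(pause_tags) != len(words) - 1:
        raise ValueError("expected one pause tag per gap between words")
    return [(words[i], words[i + 1], tag) for i, tag in enumerate(pause_tags)]
```

For example, pause_events(["we", "went", "home"], ["NP", "SP"]) yields one (left, right, tag) triple per gap, and pause_contexts("we", "went") lists the contexts a backoff model would try in turn.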
OPTIONS
- -help
-
Print option summary.
- -debug level
-
Controls the amount of output: the higher the
level,
the more output is produced.
- -tolower
-
Map all vocabulary to lowercase.
Useful if case conventions for text/counts and language model differ.
- -multiwords
-
Deal with N-best lists containing multiwords joined by underscores.
This only affects pause scoring: if a word adjacent to a pause is
a multiword and is not in the vocabulary of the pause LM, then it is split
and only the component closest to the pause is conditioned on.
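The splitting rule can be sketched as follows, assuming the pause LM vocabulary is available as a set of strings. The function name and the "side" argument are hypothetical and only serve this illustration.

```python
def pause_neighbor(word, side, pause_lm_vocab):
    """Pick the token to condition on for a word adjacent to a pause.

    side is "left" if the word precedes the pause and "right" if it
    follows the pause.
    """
    if word in pause_lm_vocab or "_" not in word:
        return word  # known word, or not a multiword: use as is
    parts = [p for p in word.split("_") if p]
    if not parts:
        return word  # nothing but underscores: leave unchanged
    # The component closest to the pause is the last one for a left
    # neighbor and the first one for a right neighbor.
    return parts[-1] if side == "left" else parts[0]
```

With a vocabulary that lacks "going_to", the call pause_neighbor("going_to", "left", vocab) returns "to", while the same word on the right of a pause gives "going".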
- -nbest file
-
Score the N-best hypotheses in
file.
- -rescore file
-
Same as
-nbest.
- -nbest-files file
-
Process all N-best list filenames listed in
file.
- -max-nbest n
-
Limits the number of hypotheses read from an N-best list.
Only the first
n
hypotheses are processed.
- -dictionary file
-
Enable pronunciation scoring, using the pronunciation dictionary
file.
Each line contains a pronunciation in the format
word [p] phone ...
The optional value
p
is the pronunciation probability.
If the second field in a line is not a number the pronunciation is assumed
to have probability one.
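A minimal sketch of reading one dictionary line under the rule above. The function name and the returned tuple layout are assumptions for illustration, and the numeric test is a simple decimal pattern rather than whatever the tool itself uses.

```python
import re

# Plain decimal number with an optional exponent, e.g. 1, 0.25, .5, 1e-3.
_NUMBER = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")


def parse_pron_line(line):
    """Split a line 'word [p] phone ...' into (word, probability, phones)."""
    fields = line.split()
    if not fields:
        raise ValueError("empty dictionary line")
    word, rest = fields[0], fields[1:]
    prob = 1.0  # used when the second field is not a number
    if rest and _NUMBER.fullmatch(rest[0]):
        prob = float(rest[0])
        rest = rest[1:]
    return word, prob, rest
```

For example, parse_pron_line("the dh ah") gives probability 1.0 because the second field, "dh", is not a number.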
- -intlogs
-
Interpret probabilities in the dictionary as intlog-scaled log probabilities
(as used in the SRI Decipher(TM) system), rather than straight probabilities.
- -pause-lm file
-
Enable pause scoring, using the pause LM in
file.
- -no-pause tag
-
The word used to represent the absence of a pause in the pause LM.
- -short-pause tag
-
The word used to represent a short pause in the pause LM.
- -long-pause tag
-
The word used to represent a long pause in the pause LM.
- -min-pause-dur T
-
The minimum duration, in seconds, for a non-speech region to be considered
a (short) pause.
- -long-pause-dur T
-
The duration, in seconds, above which a non-speech region is considered a
"long" pause.
The default values for pause tags and duration thresholds are printed by the
-help
option.
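The binning implied by the two duration thresholds can be sketched as below. The function name and the returned class labels are hypothetical. The boundary handling (a duration equal to the minimum counts as short, and only a duration strictly above the long threshold counts as long) is an assumption read off the wording of the two options, and the threshold values in the example are made up rather than the tool's defaults.

```python
def pause_class(duration, min_pause_dur, long_pause_dur):
    """Bin a non-speech duration (in seconds) into none, short, or long."""
    if duration > long_pause_dur:
        return "long"
    if duration >= min_pause_dur:
        return "short"
    return "none"


# Example with made-up thresholds of 0.1 s and 0.5 s.
classes = [pause_class(d, 0.1, 0.5) for d in (0.0, 0.1, 0.5, 0.8)]
```

Each class label would then be mapped to the corresponding tag given by -no-pause, -short-pause, or -long-pause before the pause LM is consulted.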
- -pron-score-dir dir
-
Write pronunciation scores to
dir
when processing multiple N-best lists,
using output filenames derived from the input files.
- -pause-score-dir dir
-
Write pause scores to
dir
when processing multiple N-best lists,
using output filenames derived from the input files.
SEE ALSO
nbest-format(5), nbest-scripts(1), nbest-optimize(1), ngram(1).
A. Stolcke et al., "The SRI RT-02 Speech-to-Text System",
Rich Transcription Workshop, Vienna, VA, April 2002.
BUGS
The binning of pause lengths into three classes should be generalized.
AUTHOR
Andreas Stolcke <stolcke@speech.sri.com>.
Copyright 2002 SRI International