nbest-pron-score

NAME

nbest-pron-score - score pronunciations and pauses in N-best hypotheses

SYNOPSIS

nbest-pron-score [-help] option ...

DESCRIPTION

nbest-pron-score reads N-best lists and computes log probability scores for the pronunciations and pauses contained in them. Pronunciation scoring requires that the N-best lists contain phone backtraces, in the "NBestList2.0" format described in nbest-format(5).

Pronunciation scores are computed from the probabilities in a dictionary. Pauses are binned into three length classes (none, short, long) and scored according to a trigram language model that conditions the pause length on the left and right neighboring words, in that order (so that bigram backoff uses the left neighbor only).
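
The pause model described above can be illustrated with a short Python sketch. It is not taken from the tool's source: the function names and tag strings are hypothetical, the thresholds must be supplied by the caller (the real defaults are printed by -help), and the handling of durations exactly at a threshold is an assumption based on the wording of -min-pause-dur and -long-pause-dur.

```python
# Placeholder tags; the actual tags are set with -no-pause, -short-pause
# and -long-pause, and their defaults are printed by -help.
NO_PAUSE, SHORT_PAUSE, LONG_PAUSE = "NOPAUSE", "SHORTPAUSE", "LONGPAUSE"


def pause_class(duration, min_pause_dur, long_pause_dur):
    """Bin a non-speech duration (in seconds) into one of three classes.

    Assumption: a duration equal to min_pause_dur counts as a short pause,
    and only durations strictly above long_pause_dur count as long.
    """
    if duration < min_pause_dur:
        return NO_PAUSE
    if duration > long_pause_dur:
        return LONG_PAUSE
    return SHORT_PAUSE


def pause_event(left_word, right_word, duration, min_pause_dur, long_pause_dur):
    """Describe what a pause LM lookup would condition on.

    The context is (left, right), in that order, so dropping the last
    element for bigram backoff leaves the left neighbor only.
    """
    context = (left_word, right_word)
    return {
        "pause": pause_class(duration, min_pause_dur, long_pause_dur),
        "trigram_context": context,
        "bigram_context": context[:1],
    }
```

For example, pause_event("the", "cat", 0.3, 0.1, 0.5) labels the gap between "the" and "cat" as a short pause and reports ("the",) as the backoff context. The threshold values 0.1 and 0.5 are arbitrary illustration values.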

OPTIONS

-help
Print option summary.
-debug level
Controls the amount of output (the higher the level, the more output).
-tolower
Map all vocabulary to lowercase. Useful if case conventions for text/counts and language model differ.
-multiwords
Deal with N-best lists containing multiwords joined by underscores. This only affects pause scoring: if a word adjacent to a pause is a multiword and is not in the vocabulary of the pause LM, the multiword is split and the pause is conditioned only on the component closest to it.
-nbest file
Score the N-best hypotheses in file.
-rescore file
Same as -nbest.
-nbest-files file
Process all N-best list filenames listed in file.
-max-nbest n
Limits the number of hypotheses read from an N-best list. Only the first n hypotheses are processed.
-dictionary file
Enable pronunciation scoring, using the pronunciation dictionary file. Each line contains a pronunciation in the format
word [p] phone ...
The optional value p is the pronunciation probability. If the second field in a line is not a number, the pronunciation is assumed to have probability one.
-intlogs
Interpret probabilities in the dictionary as intlog-scaled log probabilities (as used in the SRI Decipher(TM) system), rather than straight probabilities.
-pause-lm file
Enable pause scoring, using the pause LM in file.
-no-pause tag
The word used to represent the absence of a pause in the pause LM.
-short-pause tag
The word used to represent a short pause in the pause LM.
-long-pause tag
The word used to represent a long pause in the pause LM.
-min-pause-dur T
The minimum duration, in seconds, for a non-speech region to be considered a (short) pause.
-long-pause-dur T
The duration, in seconds, above which a non-speech region is considered a "long" pause.

The default values for pause tags and duration thresholds are printed by the -help option.

-pron-score-dir dir
Write pronunciation scores to dir when processing multiple N-best lists, using output filenames derived from the input files.
-pause-score-dir dir
Write pause scores to dir when processing multiple N-best lists, using output filenames derived from the input files.
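
As an illustration of the dictionary format accepted by -dictionary ("word [p] phone ..."), the following Python sketch parses a single line. It is a minimal reading of the format as described above, not the tool's own parser: the function name is hypothetical, and the -intlogs interpretation of p is deliberately left out.

```python
def parse_dictionary_line(line):
    """Parse one 'word [p] phone ...' line.

    Returns (word, probability, phones), or None for an empty line.
    If the second field is not a number, the probability defaults to 1.0
    and that field is treated as the first phone.
    """
    fields = line.split()
    if not fields:
        return None
    word, rest = fields[0], fields[1:]
    prob = 1.0
    if rest:
        try:
            # Caveat of this sketch: Python's float() also accepts strings
            # such as "inf" or "nan", so a phone with such a name would be
            # misread as a probability.
            prob = float(rest[0])
            rest = rest[1:]
        except ValueError:
            pass  # second field is a phone, keep probability one
    return word, prob, rest
```

With this sketch, the line "hello 0.7 hh ah l ow" yields a probability of 0.7, while "hello hh ah l ow" yields probability one with the same four phones. The word and phone symbols are made up for the example.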

SEE ALSO

nbest-format(5), nbest-scripts(1), nbest-optimize(1), ngram(1).
A. Stolcke et al., "The SRI RT-02 Speech-to-Text System", Rich Transcription Workshop, Vienna, VA, April 2002.

BUGS

The binning of pause lengths into three classes should be generalized.

AUTHOR

Andreas Stolcke <stolcke@speech.sri.com>.
Copyright 2002 SRI International