nbest-lattice

NAME

nbest-lattice - rescore N-best lists and lattices

SYNOPSIS

nbest-lattice [-help] option ...

DESCRIPTION

nbest-lattice rescores N-best lists or optimizes word-level recognition scores (as opposed to sentence-level scores). There are two rescoring modes. In N-best word error minimization mode, the program computes the posterior expected word error for each hypothesis relative to all hypotheses in the N-best list, choosing the one with the lowest value.

In lattice word error minimization mode, the program constructs a word lattice from all the N-best hypotheses and extracts the path with the lowest expected word error. This is similar to N-best word error minimization but allows hypotheses not contained in the N-best list. A variant of this mode uses a word ``mesh'' instead of a word lattice, in which all hypotheses are aligned into a grid of word positions, and one is allowed to choose a word from each grid position, thus allowing an even greater number of potential hypotheses.
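As an illustration of N-best word error minimization, the following is a minimal Python sketch, not the program's actual implementation. It assumes the posterior probabilities are already known and that word error is measured by plain word-level edit distance; the function names and the toy data are hypothetical.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def min_expected_error(hyps, posteriors):
    """Return (index, expected error) of the hypothesis whose
    posterior-weighted word error against all hypotheses is lowest."""
    best_index, best_error = None, float("inf")
    for i, cand in enumerate(hyps):
        expected = sum(p * edit_distance(other, cand)
                       for other, p in zip(hyps, posteriors))
        if expected < best_error:
            best_index, best_error = i, expected
    return best_index, best_error


# Toy N-best list (made-up data): the top-scoring hypothesis is first.
hyps = [
    "i saw the cat".split(),
    "i saw a cat".split(),
    "i saw a hat".split(),
]
posteriors = [0.4, 0.3, 0.3]

best, err = min_expected_error(hyps, posteriors)
print(" ".join(hyps[best]), round(err, 2))
```

In this toy list the second hypothesis is selected even though the first has the highest posterior, because it is on average closer to the other two.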

Each filename argument can be an ASCII file, or a compressed file (name ending in .Z or .gz), or ``-'' to indicate stdin/stdout.

OPTIONS

-help
Print option summary.
-debug level
Controls the amount of output (the higher the level, the more). The nature of the output depends on the rescoring mode.
-wer
Chooses N-best word error minimization mode. The default is lattice word error minimization.
-lattice-wer
Chooses lattice word error minimization mode (the default).
-rescore file
Reads the N-best list from file. The N-best list can be in any of three formats. An N-best list in Decipher(TM) format consists of the header
NBestList1.0
followed by one or more lines of the form
(score) w1 w2 w3 ...
where score is a composite acoustic/language model score from the recognizer, on the bytelog scale. This format is output by the Decipher recognizer, as well as by the ngram(1) option -nbest.
If the header is of the form
NBestList2.0
the hypotheses are expected to be in the format
(score) w1 ( st: st1 et: et1 g: g1 a: a1 ) w2 ...
where words are followed by start and end times, language model and acoustic scores (bytelog-scaled), respectively.
An alternative N-best list format lists hypotheses as
ascore lscore nwords w1 w2 w3 ...
where the first three columns contain the acoustic model log probability, the language model log probability, and the number of words in the hypothesis string, respectively. (This format must not be preceded by an ``NBestList'' header.) The alternative format is output by the ngram(1) option -rescore.
-nbest file
A synonym for -rescore.
-write-nbest file
Outputs the N-best list to a file, after sorting and processing (for validation purposes).
-nbest-files file-list
Rescores multiple N-best lists whose filenames are read from file-list.
-max-nbest n
Limits the number of hypotheses read from each N-best list to the first n.
-max-rescore m
Only choose among the top m hypotheses when optimizing word error. This is a convenient way to limit computation for long N-best lists. The cutoff is made after reading all hypotheses (subject to -max-nbest) and reordering them according to the posterior probabilities.
The time taken in N-best error minimization is proportional to m times n, where n is the length of the N-best list (or the value given to -max-nbest).
-posterior-prune threshold
Don't process N-best hypotheses whose cumulative posterior probability is below threshold. This is another strategy to speed up the algorithm.
-rescore-lmw lmw
Sets the language model weight used in combining the language model log probabilities with acoustic log probabilities (only relevant if separate scores are given in the N-best input).
-rescore-wtw wtw
Sets the word transition weight used to weight the number of words relative to the acoustic log probabilities (only relevant if separate scores are given in the N-best input).
-posterior-scale scale
Divide the total weighted log score by scale when computing normalized posterior probabilities. This controls the peakedness of the posterior distribution. The default value is whatever was chosen for lmw, so that language model scores are scaled to have weight 1, and acoustic scores have weight 1/lmw.
-prime-lattice
Start building the lattice with the best hypothesis obtained from N-best error minimization. This produces slightly better alignments and sometimes lower error rates. The default is to start with the top-scoring hypothesis.
-use-mesh
Use a word mesh instead of a word lattice.
-vocab file
Read the N-best list vocabulary from file. This option is mostly redundant since words found in the N-best input are implicitly added to the vocabulary.
-tolower
Map vocabulary to lowercase, eliminating case distinctions.
-noise noise-tag
Designate noise-tag as a vocabulary item that is to be ignored in aligning hypotheses with each other (the same as the -pau- word). This is typically used to identify a noise marker.
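The interaction of -rescore-lmw, -rescore-wtw, and -posterior-scale can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the program's code: it assumes the alternative N-best format (ascore lscore nwords w1 w2 ...), base-10 log probabilities, and a total score of ascore + lmw * lscore + wtw * nwords. The function names and the sample lines are hypothetical.

```python
def parse_alt_line(line):
    """Parse 'ascore lscore nwords w1 w2 ...' into a tuple."""
    fields = line.split()
    ascore, lscore, nwords = float(fields[0]), float(fields[1]), int(fields[2])
    return ascore, lscore, nwords, fields[3:]


def nbest_posteriors(entries, lmw, wtw, scale=None):
    """Normalized posteriors from separately given scores.

    Assumption: scores are base-10 log probabilities and are combined
    additively as ascore + lmw * lscore + wtw * nwords.
    """
    if scale is None:
        scale = lmw  # documented default: the value chosen for lmw
    totals = [(a + lmw * l + wtw * n) / scale for a, l, n, _ in entries]
    peak = max(totals)  # subtract the maximum to avoid underflow
    weights = [10.0 ** (t - peak) for t in totals]
    norm = sum(weights)
    return [w / norm for w in weights]


# Made-up N-best entries in the alternative format.
lines = [
    "-120.0 -9.0 3 i saw cats",
    "-118.0 -10.0 4 i saw a cat",
]
entries = [parse_alt_line(l) for l in lines]

sharp = nbest_posteriors(entries, lmw=8.0, wtw=0.0)             # scale = lmw
flat = nbest_posteriors(entries, lmw=8.0, wtw=0.0, scale=16.0)  # larger scale
print(sharp, flat)
```

A larger scale divides the score differences by a larger number, so the posterior distribution becomes flatter; a smaller scale makes it more peaked.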

The following options only affect word correct maximization mode.

-read file
Reads an initial lattice from file, to be merged with additional paths constructed from the N-best hypotheses.
-write file
Writes the resulting N-best lattice to file.
-no-merge
Build a lattice from the N-best hypotheses without merging edges (string/lattice alignment). This creates a lattice with one disjoint path per hypothesis, and is useful mainly for debugging purposes.
-no-reorder
Process N-best hypotheses in the order in which they appear. By default, hypotheses are first sorted by their aggregate scores.
-lattice-errors references
Compute the lattice error (minimum word error) of the lattice read with -read or built with -nbest. Error is computed relative to all the hypotheses contained in a separate N-best list given by references.
-nbest-errors references
Compute the N-best error (minimum word error) of the N-best list read with -nbest. Error is computed relative to all the hypotheses contained in a separate N-best list given by references. Pause and noise tokens (as specified with -noise) in the N-best list are ignored.
-dump-posteriors
Output posterior probabilities instead of word hypotheses. In N-best mode, only the posterior probability for each hypothesis is output. In lattice mode, the hypothesis posterior is followed by word posterior probabilities for each (non-pause, non-noise) token in the hypothesis.
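The N-best error reported by -nbest-errors can be pictured with the following Python sketch. It is a simplified reading, not the program's implementation: it assumes a single reference word string, measures error by word-level edit distance, and drops pause and noise tokens before comparison. The noise tag ``[noise]'', the function names, and the data are hypothetical.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def nbest_error(hyps, reference, ignored=("-pau-",)):
    """Return (index, errors) of the hypothesis closest to the reference,
    ignoring pause and noise tokens in the hypotheses."""
    best_index, best_errors = None, float("inf")
    for i, hyp in enumerate(hyps):
        words = [w for w in hyp if w not in ignored]
        errors = edit_distance(reference, words)
        if errors < best_errors:
            best_index, best_errors = i, errors
    return best_index, best_errors


# Made-up N-best list; "[noise]" stands in for a tag given with -noise.
hyps = [
    "i saw -pau- the cat".split(),
    "i saw a cat [noise]".split(),
    "i saw a hat".split(),
]
reference = "i saw a cat".split()

index, errors = nbest_error(hyps, reference, ignored=("-pau-", "[noise]"))
print(index, errors)
```

With the ignored tokens removed, the second hypothesis matches the reference exactly, so the minimum word error for this list is zero.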

SEE ALSO

ngram(1).
A. Stolcke, Y. Konig, and M. Weintraub, ``Explicit Word Error Minimization in N-best List Rescoring,'' Proc. Eurospeech, 163-166, 1997.

BUGS

AUTHOR

Andreas Stolcke <stolcke@speech.sri.com>.
Copyright 1996-1998 SRI International