nbest-lattice
NAME
nbest-lattice - rescore N-best lists and lattices
SYNOPSIS
nbest-lattice
[-help]
option
...
DESCRIPTION
nbest-lattice
rescores N-best lists or optimizes word-level recognition scores
(as opposed to sentence-level scores).
There are two rescoring modes.
In
N-best word error minimization
mode, the program computes the posterior expected word error for each
hypothesis relative to all hypotheses in the N-best list, choosing the one
with the lowest value.
In
lattice word error minimization
mode, the program constructs a word lattice from all the N-best hypotheses
and extracts the path with the lowest expected word error.
This is similar to N-best word error minimization but allows
hypotheses not contained in the N-best list.
A variant of this mode uses a word ``mesh'' instead of a word lattice,
in which all hypotheses are aligned into a grid of word positions,
and one is allowed to choose a word from each grid position, thus allowing an
even greater number of potential hypotheses.
Each filename argument can be an ASCII file, or a
compressed file (name ending in .Z or .gz), or ``-'' to indicate
stdin/stdout.
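As an illustration of N-best word error minimization, the following Python sketch computes the posterior expected word error of each candidate against all hypotheses and returns the candidate with the lowest value. It is a minimal sketch, not the program's implementation: the function names, the (posterior, words) input layout, and the max_rescore argument (modeled on the -max-rescore option) are assumptions made for this example, and the posteriors are assumed to be already normalized.

```python
from typing import List, Optional, Sequence, Tuple

Hypothesis = Tuple[float, List[str]]  # (posterior, words); layout is an assumption


def word_edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance with unit costs."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # match or substitution
        prev = curr
    return prev[-1]


def min_expected_error(nbest: List[Hypothesis],
                       max_rescore: Optional[int] = None) -> Tuple[List[str], float]:
    """Return the candidate with the lowest posterior expected word error.

    Only the top max_rescore hypotheses (by posterior) are candidates, but
    each candidate is scored against the full list, so the cost is m * n
    distance computations.
    """
    ranked = sorted(nbest, key=lambda h: h[0], reverse=True)
    candidates = ranked[:max_rescore]  # a slice bound of None keeps everything
    best_words: List[str] = []
    best_err = float("inf")
    for _, cand in candidates:
        err = sum(post * word_edit_distance(ref, cand) for post, ref in ranked)
        if err < best_err:
            best_words, best_err = cand, err
    return best_words, best_err
```

For the list [(0.4, a b c), (0.3, a x c), (0.3, a x d)], this sketch selects "a x c" (expected error about 0.7) rather than the top-scoring "a b c" (about 0.9), which shows how the minimum expected error hypothesis can differ from the highest-posterior one.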
OPTIONS
- -help
-
Print option summary.
- -debug level
-
Controls the amount of output (the higher the
level,
the more).
The nature of the output depends on the rescoring mode.
- -wer
-
Chooses N-best word error minimization mode.
The default is lattice word error minimization.
- -lattice-wer
-
Chooses lattice word error minimization mode (the default).
- -rescore file
-
Reads the N-best list from
file.
The N-best list can be in any of three formats.
An N-best list in Decipher(TM) format consists of the header
NBestList1.0
followed by one or more lines of the form
(score) w1 w2 w3 ...
where
score
is a composite acoustic/language model score
from the recognizer, on the bytelog scale.
This format is output by the Decipher recognizer, as well as
by the
ngram(1)
option
-nbest.
If the header is of the form
NBestList2.0
the hypotheses are expected to be in the format
(score) w1 ( st: st1 et: et1 g: g1 a: a1 ) w2 ...
where words are followed by start and end times, language model and
acoustic scores (bytelog-scaled), respectively.
An alternative N-best list format lists hypotheses as
ascore lscore nwords w1 w2 w3 ...
where the first three columns contain the
acoustic model log probability, the language model log probability,
and the number of words in the hypothesis string, respectively.
(This format must not be preceded by an ``NBestList'' header.)
The alternative format is output by the
ngram(1)
option
-rescore.
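The three layouts described above can be told apart by the presence and version of the header line. The following Python sketch shows one way to read them. It is an illustration only: the function name, the returned dictionary keys, and the regular expressions are assumptions for this example, and scores are kept as raw numbers without any bytelog conversion.

```python
import re
from typing import Dict, List

# Per-word annotation of the NBestList2.0 layout: ( st: .. et: .. g: .. a: .. )
WORD_INFO = re.compile(
    r"\(\s*st:\s*(\S+)\s+et:\s*(\S+)\s+g:\s*(\S+)\s+a:\s*(\S+)\s*\)")
# Leading "(score)" shared by both Decipher layouts
SCORED_LINE = re.compile(r"\((\S+)\)\s*(.*)")


def parse_nbest(lines: List[str]) -> List[Dict]:
    """Parse an N-best list given as text lines (hypothetical helper)."""
    lines = [ln.strip() for ln in lines if ln.strip()]
    hyps: List[Dict] = []
    if lines and lines[0].startswith("NBestList"):
        version = lines[0]
        for ln in lines[1:]:
            m = SCORED_LINE.match(ln)
            if m is None:
                raise ValueError(f"malformed hypothesis line: {ln!r}")
            rest = m.group(2)
            if version == "NBestList2.0":
                # Drop the timing/score annotations, keeping only the words.
                rest = WORD_INFO.sub(" ", rest)
            hyps.append({"score": float(m.group(1)), "words": rest.split()})
    else:
        # Headerless layout: ascore lscore nwords w1 w2 w3 ...
        for ln in lines:
            fields = ln.split()
            hyps.append({"ascore": float(fields[0]),
                         "lscore": float(fields[1]),
                         "nwords": int(fields[2]),
                         "words": fields[3:]})
    return hyps
```

The sketch discards the per-word times and scores of the NBestList2.0 layout for brevity; a fuller reader would keep the groups captured by WORD_INFO.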
- -nbest file
-
A synonym for
-rescore.
- -write-nbest file
-
Outputs the N-best list to a file, after sorting and processing
(for validation purposes).
- -nbest-files file-list
-
Rescores multiple N-best lists whose filenames are read from
file-list.
- -max-nbest n
-
Limits the number of hypotheses read from each N-best list to the first
n.
- -max-rescore m
-
Only choose among the top
m
hypotheses when optimizing word error.
This is useful for limiting computation on long N-best lists.
The cutoff is made after reading all hypotheses (subject to
-max-nbest)
and reordering them according to the posterior probabilities.
The time taken in N-best error minimization is proportional to
m
times
n,
where
n
is the length of the N-best list (or the value given to
-max-nbest).
- -posterior-prune threshold
-
Don't process N-best hypotheses whose cumulative posterior probability
is below
threshold.
This is another strategy to speed up the algorithm.
- -rescore-lmw lmw
-
Sets the language model weight used in combining the language model log
probabilities with acoustic log probabilities
(only relevant if separate scores are given in the N-best input).
- -rescore-wtw wtw
-
Sets the word transition weight used to weight the number of words relative to
the acoustic log probabilities
(only relevant if separate scores are given in the N-best input).
- -posterior-scale scale
-
Divide the total weighted log score by
scale
when computing normalized posterior probabilities.
This controls the peakedness of the posterior distribution.
The default value is whatever was chosen for
lmw,
so that language model scores are scaled to have weight 1,
and acoustic scores have weight 1/lmw.
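The interaction of the weights and the posterior scale can be sketched in Python as follows. This is a hedged illustration rather than the program's code: the function name and the (ascore, lscore, nwords) tuple layout are assumptions, and the scores are assumed to be base-10 log probabilities (pass a different log_base if they are not).

```python
import math
from typing import List, Optional, Tuple


def hypothesis_posteriors(hyps: List[Tuple[float, float, int]],
                          lmw: float,
                          wtw: float,
                          scale: Optional[float] = None,
                          log_base: float = 10.0) -> List[float]:
    """Normalized posteriors from (ascore, lscore, nwords) triples."""
    if scale is None:
        scale = lmw  # the documented default for the posterior scale
    ln_base = math.log(log_base)
    # Combined log score, divided by the posterior scale, in natural-log units.
    scaled = [(a + lmw * l + wtw * n) / scale * ln_base for a, l, n in hyps]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]
```

With two hypotheses whose acoustic scores differ by 10 and lmw set to 8, the better hypothesis gets a posterior of roughly 0.95 at the default scale but only about 0.57 when the scale is raised to 80, which is the flattening effect described above.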
- -prime-lattice
-
Start building the lattice with the best hypothesis obtained from
N-best error minimization. This produces slightly better alignments
and sometimes lower error rates. The default is to start with the
top-scoring hypothesis.
- -use-mesh
-
Use a word mesh instead of a word lattice.
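To illustrate how choosing one word per grid position can produce a hypothesis that is not in the N-best list, the Python sketch below picks the word with the highest total posterior at each position of an already aligned grid. It assumes the alignment has been done elsewhere; the function name, the input layout, and the DELETE marker for empty positions are hypothetical choices for this example, not the program's internal representation.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

DELETE = "*DELETE*"  # hypothetical marker meaning "no word at this position"


def best_mesh_path(aligned: List[Tuple[float, List[str]]]) -> List[str]:
    """Pick the highest-posterior word at each position of an aligned grid.

    aligned holds (posterior, row) pairs; every row has the same length,
    with DELETE filling positions where a hypothesis has no word.
    """
    n_positions = len(aligned[0][1])
    result: List[str] = []
    for pos in range(n_positions):
        mass: Dict[str, float] = defaultdict(float)
        for post, row in aligned:
            mass[row[pos]] += post  # accumulate posterior mass per word
        word = max(mass, key=lambda w: mass[w])
        if word != DELETE:
            result.append(word)
    return result
```

Given the rows (0.4, a b c), (0.3, a x d), and (0.3, e b d), the sketch returns "a b d", a word sequence that none of the three input hypotheses contains.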
- -vocab file
-
Read the N-best list vocabulary from
file.
This option is mostly redundant since words found in the N-best input
are implicitly added to the vocabulary.
- -tolower
-
Map vocabulary to lowercase, eliminating case distinctions.
- -noise noise-tag
-
Designate
noise-tag
as a vocabulary item that is to be ignored in aligning hypotheses with
each other (the same as the -pau- word).
This is typically used to identify a noise marker.
The following options only affect lattice word error minimization mode.
- -read file
-
Reads an initial lattice from
file,
to be merged with additional paths constructed from the
N-best hypotheses.
- -write file
-
Writes the resulting N-best lattice to
file.
- -no-merge
-
Build a lattice from the N-best hypotheses without merging edges
(string/lattice alignment). This creates a lattice with one disjoint path
per hypothesis, and is useful mainly for debugging purposes.
- -no-reorder
-
Process N-best hypotheses in the order in which they appear.
By default, hypotheses are first sorted by their aggregate scores.
- -lattice-errors references
-
Compute the lattice error (minimum word error) of the lattice read with
-read
or built with
-nbest.
Error is computed relative to all the hypotheses contained in a separate
N-best list given by
references.
- -nbest-errors references
-
Compute the N-best error (minimum word error) of the N-best list read with
-nbest.
Error is computed relative to all the hypotheses contained in a separate
N-best list given by
references.
Pause and noise tokens (as specified with
-noise)
in the N-best list are ignored.
- -dump-posteriors
-
Output posterior probabilities instead of word hypotheses.
In N-best mode, only the posterior probability for each hypothesis is output.
In lattice mode, the hypothesis posterior is followed by word posterior probabilities
for each (non-pause, non-noise) token in the hypothesis.
SEE ALSO
ngram(1).
A. Stolcke, Y. Konig, and M. Weintraub,
``Explicit Word Error Minimization in N-best List Rescoring,''
Proc. Eurospeech, 163-166, 1997.
BUGS
AUTHOR
Andreas Stolcke <stolcke@speech.sri.com>.
Copyright 1996-1998 SRI International