segment-nbest
NAME
segment-nbest - rescore and segment N-best lists using N-gram language models
SYNOPSIS
segment-nbest
[-help]
option
...
nbest-file-list
...
DESCRIPTION
segment-nbest
processes a series of consecutive N-best lists from a speech
recognizer
and applies a hidden segment N-gram language model to them.
The language model is a standard backoff N-gram model in ARPA
ngram-format(5)
that models segmentation using the boundary tags <s> and </s>.
The program reads in all N-best lists and outputs the
hypotheses that have the highest aggregate score (both acoustic
and language model).
Hypothesized segment boundaries are marked by <s> tags.
Each filename argument can be an ASCII file, or a
compressed file (name ending in .Z or .gz), or ``-'' to indicate
stdin/stdout.
OPTIONS
- -help
-
Print option summary.
- -order n
-
Set the maximum N-gram order to be used (default 3).
NOTE: The order of the model is not set automatically when a model
file is read, so the same file can be used at various orders.
- -debug level
-
Set the debugging output level (0 means no debugging output).
Debugging messages are sent to stderr.
- -lm file
-
Read the N-gram model from
file.
- -tolower
-
Map all vocabulary to lowercase.
Useful if case conventions for N-best lists and language model differ.
- -mix-lm file
-
Read a second, standard N-gram model for interpolation purposes.
- -lambda weight
-
Set the weight of the main model when interpolating with
-mix-lm.
Default value is 0.5.
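As an illustration of what the mixture weight means, the following is a minimal sketch of linear interpolation between two models. It assumes both models report log10 probabilities for the same word in the same context; the function name interpolate_log10 is hypothetical and is not part of segment-nbest.

```python
import math

def interpolate_log10(logp_main, logp_mix, lam=0.5):
    """Linearly mix two log10 probabilities.

    lam is the weight of the main model (the role played by -lambda);
    the second model (-mix-lm) receives weight 1 - lam.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    # Interpolation is done in the probability domain, not the log domain.
    p = lam * 10.0 ** logp_main + (1.0 - lam) * 10.0 ** logp_mix
    return math.log10(p)

# Example: the main model assigns 0.01, the second model 0.1, equal weights.
mixed = interpolate_log10(math.log10(0.01), math.log10(0.1), lam=0.5)
```

With equal weights the mixed probability is the plain average of the two inputs.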
- -bayes length
-
Interpolate the second and the main model using posterior probabilities
for local N-gram-contexts of length
length.
The
-lambda
value is used as a prior mixture weight in this case.
- -bayes-scale scale
-
Set the exponential scale factor on the context likelihood in conjunction
with the
-bayes
function.
Default value is 1.0.
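The posterior-weighted interpolation can be pictured with the sketch below. It is an assumption about the general shape of the computation, not a description of the actual implementation: each model's prior weight is multiplied by the likelihood that model assigns to the preceding context, raised to the scale factor, and the results are normalized. All function and parameter names are hypothetical.

```python
def bayes_mixture_weight(ctx_like_main, ctx_like_mix, lam=0.5, scale=1.0):
    """Posterior weight of the main model given the local context.

    ctx_like_main, ctx_like_mix: likelihood each model assigns to the
        preceding context (assumed to cover -bayes words).
    lam:   prior weight of the main model (the role of -lambda).
    scale: exponent applied to the likelihoods (the role of -bayes-scale).
    """
    a = lam * ctx_like_main ** scale
    b = (1.0 - lam) * ctx_like_mix ** scale
    total = a + b
    if total == 0.0:
        # Neither model supports the context; fall back to the prior.
        return lam
    return a / total

def bayes_interpolate(p_main, p_mix, ctx_like_main, ctx_like_mix,
                      lam=0.5, scale=1.0):
    """Mix two word probabilities using the context-dependent weight."""
    w = bayes_mixture_weight(ctx_like_main, ctx_like_mix, lam, scale)
    return w * p_main + (1.0 - w) * p_mix
```

Under this sketch, a scale of 0 makes the context likelihoods irrelevant, so the weight reduces to the prior and the result matches plain linear interpolation.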
- -nbest-files list
-
Specifies a list of N-best files.
The file
list
should contain a list of filenames, one per line,
each corresponding to an N-best file in one of the formats
described in
nbest-format(5).
The N-best files should correspond to consecutive speech waveforms
in the order listed.
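A list file of this kind can be produced with any tool that writes one filename per line. The sketch below is one way to do it; the helper name write_nbest_file_list is hypothetical, and the caller is assumed to pass the files already in waveform order.

```python
from pathlib import Path

def write_nbest_file_list(nbest_paths, list_path):
    """Write one N-best filename per line, preserving the given order.

    The order matters: the files are expected to correspond to
    consecutive speech waveforms, so the caller must not sort them
    in a way that breaks the recording order.
    """
    lines = []
    for p in nbest_paths:
        name = str(p)
        if not name or "\n" in name:
            raise ValueError(f"unusable filename: {name!r}")
        lines.append(name)
    Path(list_path).write_text("".join(name + "\n" for name in lines))
    return lines
```

The resulting file can then be named as the argument of -nbest-files.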
- -fb-rescore
-
Perform forward-backward rescoring.
This generates new N-best lists
as output whose LM scores reflect the posterior probability of each
hypothesis.
The default is to perform Viterbi rescoring and output only the
best combined hypothesis.
- -max-nbest n
-
Limits the number of hypotheses read from each N-best list to the first
n.
- -max-rescore m
-
Only choose among the top
m
hypotheses of each list (after reordering hypotheses, see below).
This is an effective way to limit the quadratic computation
of the Viterbi or forward/backward dynamic programming.
- -no-reorder
-
Do not reorder the hypotheses before limiting the computation to
the top
m.
By default the hypotheses will first be sorted according to the
acoustic and language model scores recorded in the N-best lists.
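The interaction of -max-nbest, -max-rescore and -no-reorder can be sketched as follows. This is an illustrative model of the pruning order described above, not the program's code; hypotheses are represented as (score, words) pairs, where the score is assumed to be the combined score already recorded in the list and higher is better.

```python
def select_for_rescoring(hyps, max_nbest=None, max_rescore=None, reorder=True):
    """Return the hypotheses that would take part in rescoring.

    hyps: list of (score, words) pairs in file order.
    """
    # -max-nbest: keep only the first n hypotheses as read from the file.
    kept = list(hyps) if max_nbest is None else list(hyps[:max_nbest])
    # Default behavior: sort by recorded score; -no-reorder skips this.
    if reorder:
        kept = sorted(kept, key=lambda h: h[0], reverse=True)
    # -max-rescore: only the top m remain candidates.
    if max_rescore is not None:
        kept = kept[:max_rescore]
    return kept

def transition_count(list_sizes):
    """Hypothesis pairs considered between adjacent lists (the quadratic term)."""
    return sum(a * b for a, b in zip(list_sizes, list_sizes[1:]))
```

For example, with 100 hypotheses per list each adjacent pair of lists contributes 10000 transitions, while limiting each list to 10 candidates reduces that to 100.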
- -rescore-lmw weight
-
Specifies the language model weight to be used in combining
acoustic and language model scores to select the best hypotheses.
- -rescore-wtw weight
-
Specifies the word transition weight to be used in selecting the
best hypotheses.
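The two weights are typically combined with the acoustic score in a log-linear fashion. The sketch below assumes the common form acoustic + lmw * lm + wtw * number-of-words, with all scores as log values; this form and the weight values used in the example are assumptions for illustration, not values taken from the program.

```python
def combined_score(acoustic, lm, num_words, lmw, wtw):
    """Weighted combination of log scores for one hypothesis.

    lmw plays the role of -rescore-lmw, wtw that of -rescore-wtw.
    """
    return acoustic + lmw * lm + wtw * num_words

def best_hypothesis(hyps, lmw, wtw):
    """Pick the (acoustic, lm, words) triple with the highest combined score."""
    return max(hyps, key=lambda h: combined_score(h[0], h[1], len(h[2]), lmw, wtw))

# Two hypothetical hypotheses; the weights are example values only.
hyps = [
    (-1000.0, -20.0, ["so", "i", "went", "home", "okay"]),
    (-1010.0, -15.0, ["so", "i", "went", "home", "oh", "okay"]),
]
best = best_hypothesis(hyps, lmw=8.0, wtw=0.0)
```

In this example the second hypothesis wins because its better language model score outweighs its slightly worse acoustic score once the language model weight is applied.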
- -noise noise-tag
-
Designate
noise-tag
as a vocabulary item that is to be ignored by the LM.
(This is typically used to identify a noise marker.)
- -noise-vocab file
-
Read several noise tags from
file,
instead of, or in addition to, the single noise tag specified by
-noise.
- -decipher-lm model-file
-
Designates the N-gram backoff model (typically a bigram) that was used by the
Decipher(TM) recognizer in computing composite scores.
Used to compute acoustic scores from the composite scores if the
N-best lists are in "NBestList1.0" format.
- -decipher-lmw weight
-
Specifies the language model weight used by the recognizer.
Used to compute acoustic scores from the composite scores.
- -decipher-wtw weight
-
Specifies the word transition weight used by the recognizer.
Used to compute acoustic scores from the composite scores.
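Recovering an acoustic score from a composite score amounts to subtracting the weighted language model and word transition contributions. The sketch below assumes the composite score has the log-linear form acoustic + lmw * lm + wtw * number-of-words and ignores any unit conversion the actual N-best format may require; the function name is hypothetical.

```python
def acoustic_from_composite(composite, lm, num_words, lmw, wtw):
    """Undo the recognizer's score combination for one hypothesis.

    lm:  log probability from the recognizer's own model (-decipher-lm).
    lmw: the recognizer's language model weight (-decipher-lmw).
    wtw: the recognizer's word transition weight (-decipher-wtw).
    """
    return composite - lmw * lm - wtw * num_words
```

This is why the recognizer's original model and weights must be supplied: without them the language model contribution cannot be removed from the composite score.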
- -stag string
-
Use
string
to mark segment boundaries in the output.
Default is the start-of-sentence symbol defined in the language model (<s>).
- -bias b
-
Make a segment boundary a priori more likely by a factor of
b.
segment-nbest
will also process any command line arguments following the options
as lists of N-best lists, as with the
-nbest-files
option.
Each
nbest-file-list
will be processed in turn,
with individual output delimited by a line of the form
<nbestfile nbest-file-list>
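When several lists are processed in one run, downstream scripts need to split the output at the delimiter lines. The following sketch does that under the assumption that each delimiter is a whole line of the literal form <nbestfile NAME>; the function name split_output is hypothetical.

```python
import re

# A delimiter line: "<nbestfile", whitespace, the list name, then ">".
_DELIM = re.compile(r"^<nbestfile\s+(.*)>\s*$")

def split_output(lines):
    """Group output lines by the nbest-file-list that produced them.

    Returns a dict mapping each list name to its output lines.
    Lines seen before the first delimiter are ignored.
    """
    sections = {}
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = _DELIM.match(line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections
```

Hypothesis lines that begin with the <s> boundary tag are not mistaken for delimiters, because the pattern requires the literal prefix <nbestfile followed by whitespace.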
SEE ALSO
ngram-count(1), segment(1), ngram-format(5), nbest-format(5).
A. Stolcke, ``Modeling Linguistic Segment and Turn Boundaries for N-best
Rescoring of Spontaneous Speech,'' Proc. Eurospeech, 2779-2782, 1997.
BUGS
Only N-gram models up to trigram order are used accurately.
AUTHOR
Andreas Stolcke <stolcke@speech.sri.com>.
Copyright 1997 SRI International