segment-nbest

NAME

segment-nbest - rescore and segment N-best lists using N-gram language models

SYNOPSIS

segment-nbest [-help] option ... nbest-file-list ...

DESCRIPTION

segment-nbest processes a series of consecutive N-best lists from a speech recognizer and applies a hidden segment N-gram language model to them. The language model is a standard backoff N-gram model in ARPA format (as created by ngram-count(1)) modeling segmentation using the boundary tags <s> and </s>. The program reads in all N-best lists and outputs the hypotheses that have the highest aggregate score (both acoustic and language model). Hypothesized segment boundaries are marked by <s> tags.
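
The selection of one hypothesis per N-best list can be pictured as a Viterbi search over consecutive lists. The following is a minimal sketch of that idea, not the program's actual implementation: the names viterbi_select and transition are hypothetical, each hypothesis is assumed to carry a single log score, and transition stands in for whatever score the segment language model would assign across the boundary between two adjacent hypotheses.

```python
from typing import Callable, List, Sequence, Tuple

# A hypothesis is (word string, log score); both fields are illustrative.
Hyp = Tuple[str, float]


def viterbi_select(
    nbest_lists: Sequence[Sequence[Hyp]],
    transition: Callable[[str, str], float],
) -> Tuple[List[int], float]:
    """Pick one hypothesis index per list so that the summed score is maximal.

    Assumes every list is non-empty. `transition(prev, cur)` is a
    hypothetical log score for following hypothesis `prev` with `cur`.
    """
    if not nbest_lists:
        return [], 0.0

    # best[j]: best total score of any path ending in hypothesis j
    # of the list processed most recently.
    best = [score for _, score in nbest_lists[0]]
    backptrs: List[List[int]] = []

    for prev_list, cur_list in zip(nbest_lists, nbest_lists[1:]):
        new_best: List[float] = []
        ptrs: List[int] = []
        for cur_words, cur_score in cur_list:
            # Every predecessor is tried for every hypothesis, so the
            # work per pair of lists grows quadratically with list size.
            cands = [
                best[i] + transition(prev_words, cur_words)
                for i, (prev_words, _) in enumerate(prev_list)
            ]
            i_best = max(range(len(cands)), key=cands.__getitem__)
            new_best.append(cands[i_best] + cur_score)
            ptrs.append(i_best)
        best = new_best
        backptrs.append(ptrs)

    # Trace the best path back from the last list to the first.
    j = max(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for ptrs in reversed(backptrs):
        j = ptrs[j]
        path.append(j)
    path.reverse()
    return path, total
```

Because every hypothesis in one list is paired with every hypothesis in the next, truncating each list before the search reduces the cost substantially.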

Each filename argument can be an ASCII file, or a compressed file (name ending in .Z or .gz), or ``-'' to indicate stdin/stdout.

OPTIONS

-help
Print option summary.
-order n
Set the maximal N-gram order to be used. The default is 3. NOTE: The order of the model is not set automatically when a model file is read, so the same file can be used at various orders.
-debug level
Set the debugging output level (0 means no debugging output). Debugging messages are sent to stderr.
-lm file
Read the N-gram model from file.
-tolower
Map all vocabulary to lowercase. Useful if case conventions for N-best lists and language model differ.
-mix-lm file
Read a second, standard N-gram model for interpolation purposes.
-lambda weight
Set the weight of the main model when interpolating with -mix-lm. Default value is 0.5.
-bayes length
Interpolate the second and the main model using posterior probabilities for local N-gram contexts of length length. The -lambda value is used as a prior mixture weight in this case.
-bayes-scale scale
Set the exponential scale factor on the context likelihood in conjunction with the -bayes function. Default value is 1.0.
-nbest-files list
Specifies a list of N-best files. The file list should contain a list of filenames, one per line, each corresponding to an N-best file in one of the formats supported by ngram(1). The N-best files should correspond to consecutive speech waveforms in the order listed.
-fb-rescore
Perform forward-backward rescoring. This generates new N-best lists as output whose LM scores reflect the posterior probability of each hypothesis. The default is to perform Viterbi rescoring and output only the best combined hypothesis.
-max-nbest n
Limits the number of hypotheses read from each N-best list to the first n.
-max-rescore m
Only choose among the top m hypotheses of each list (after reordering hypotheses, see -no-reorder). This is an effective way to limit the quadratic computation of the Viterbi or forward-backward dynamic programming.
-no-reorder
Do not reorder the hypotheses before limiting the computation to the top m. By default the hypotheses will first be sorted according to the acoustic and language model scores recorded in the N-best lists.
-rescore-lmw weight
Specifies the language model weight to be used in combining acoustic and language model scores to select the best hypotheses.
-rescore-wtw weight
Specifies the word transition weight to be used in selecting the best hypotheses.
-noise noise-tag
Designate noise-tag as a vocabulary item that is to be ignored by the LM. (This is typically used to identify a noise marker.)
-decipher-lm model-file
Designates the N-gram backoff model (typically a bigram) that was used by the Decipher(TM) recognizer in computing composite scores. Used to compute acoustic scores from the composite scores if the N-best lists are in "NBestList1.0" format.
-decipher-lmw weight
Specifies the language model weight used by the recognizer. Used to compute acoustic scores from the composite scores.
-decipher-wtw weight
Specifies the word transition weight used by the recognizer. Used to compute acoustic scores from the composite scores.
-stag string
Use string to mark segment boundaries in the output. Default is the start-of-sentence symbol defined in the language model (<s>).
-bias b
Make a segment boundary a priori more likely by a factor of b.
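
The weight options above can be illustrated with a small sketch of the score arithmetic. It assumes the usual log-linear combination (acoustic score plus weighted language model score plus a per-word transition term); this page does not spell out the exact formula, so the expressions below are an assumption. The names Hyp, combined_score, acoustic_from_composite and top_m are hypothetical and not part of the tool.

```python
from typing import List, NamedTuple, Sequence


class Hyp(NamedTuple):
    words: List[str]
    acoustic: float  # acoustic log score
    lm: float        # language model log score


def combined_score(hyp: Hyp, lmw: float, wtw: float) -> float:
    """Assumed combination: acoustic + lmw * lm + wtw * number of words."""
    return hyp.acoustic + lmw * hyp.lm + wtw * len(hyp.words)


def acoustic_from_composite(
    composite: float, lm: float, num_words: int, lmw: float, wtw: float
) -> float:
    """Invert the assumed combination to recover the acoustic score,
    in the spirit of the -decipher-lmw and -decipher-wtw options."""
    return composite - lmw * lm - wtw * num_words


def top_m(
    hyps: Sequence[Hyp], m: int, lmw: float, wtw: float, reorder: bool = True
) -> List[Hyp]:
    """Keep the first m hypotheses, sorting by combined score first
    unless reorder is False (compare -max-rescore and -no-reorder)."""
    if reorder:
        ranked = sorted(
            hyps, key=lambda h: combined_score(h, lmw, wtw), reverse=True
        )
    else:
        ranked = list(hyps)
    return ranked[:m]
```

With reordering disabled, truncation simply keeps the hypotheses in the order in which they appear in the list.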

segment-nbest will also process any command line arguments following the options as lists of N-best lists, as with the -nbest-files option. Each nbest-file-list will be processed in turn, with individual output delimited by a line of the form
<nbestfile nbest-file-list>
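
When several file lists are given, the combined output can be split back into one section per list by looking for the delimiter lines. The following is a minimal sketch under two assumptions that this page does not confirm: each delimiter line precedes the output for its file list, and it consists of exactly <nbestfile NAME> with nothing else on the line. The function name split_by_file_list is hypothetical.

```python
import re
from typing import Dict, Iterable, List

# Matches a whole line of the form "<nbestfile NAME>".
_DELIM = re.compile(r"^<nbestfile\s+(.+)>$")


def split_by_file_list(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group output lines under the file list named by the preceding delimiter.

    Lines that appear before the first delimiter are ignored.
    """
    sections: Dict[str, List[str]] = {}
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = _DELIM.match(line)
        if match:
            current = match.group(1)
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections
```

Hypothesis lines containing boundary tags such as <s> are not mistaken for delimiters, because only lines that begin with <nbestfile are matched.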

SEE ALSO

ngram-count(1), ngram(1), segment(1).
A. Stolcke, ``Modeling Linguistic Segment and Turn Boundaries for N-best Rescoring of Spontaneous Speech,'' Proc. Eurospeech, 1997.

BUGS

Only N-gram models up to trigram order are used accurately.

AUTHOR

Andreas Stolcke <stolcke@speech.sri.com>.
Copyright 1997 SRI International