generation/logits_process
- generation/logits_process
- .LogitsProcessor
- .LogitsWarper
- .LogitsProcessorList
- .ForcedBOSTokenLogitsProcessor
- .ForcedEOSTokenLogitsProcessor
- .SuppressTokensAtBeginLogitsProcessor
- .WhisperTimeStampLogitsProcessor
- .NoRepeatNGramLogitsProcessor
    - new NoRepeatNGramLogitsProcessor(no_repeat_ngram_size)
    - .getNgrams(prevInputIds) ⇒ Map.<string, Array<number>>
    - .getGeneratedNgrams(bannedNgrams, prevInputIds) ⇒ Array.<number>
    - .calcBannedNgramTokens(prevInputIds) ⇒ Array.<number>
    - ._call(input_ids, logits) ⇒ Tensor
- .RepetitionPenaltyLogitsProcessor
- .MinLengthLogitsProcessor
- .MinNewTokensLengthLogitsProcessor
- .NoBadWordsLogitsProcessor
- .ClassifierFreeGuidanceLogitsProcessor
- .TemperatureLogitsWarper
- .TopPLogitsWarper
- .TopKLogitsWarper
generation/logits_process.LogitsProcessor
Abstract base class for all logit processors that can be applied during generation.
Kind: static class of generation/logits_process
logitsProcessor._call(input_ids, logits)
Apply the processor to the input logits.
Kind: instance abstract method of LogitsProcessor
Throws:
Error
Throws an error if `_call` is not implemented in the subclass.
Param | Type | Description |
---|---|---|
input_ids | Array.<Array<bigint>> | The input ids. |
logits | Tensor | The logits to process. |
generation/logits_process.LogitsWarper
Abstract base class for all logit warpers that can be applied during generation with multinomial sampling.
Kind: static class of generation/logits_process
logitsWarper._call(input_ids, logits)
Apply the processor to the input logits.
Kind: instance abstract method of LogitsWarper
Throws:
Error
Throws an error if `_call` is not implemented in the subclass.
Param | Type | Description |
---|---|---|
input_ids | Array.<Array<bigint>> | The input ids. |
logits | Tensor | The logits to process. |
generation/logits_process.LogitsProcessorList
A class representing a list of logits processors. A logits processor is a function that modifies the logits output of a language model. This class provides methods for adding new processors and applying all processors to a batch of logits.
Kind: static class of generation/logits_process
new LogitsProcessorList()
Constructs a new instance of LogitsProcessorList.
logitsProcessorList.push(item)
Adds a new logits processor to the list.
Kind: instance method of LogitsProcessorList
Param | Type | Description |
---|---|---|
item | LogitsProcessor | The logits processor function to add. |
logitsProcessorList.extend(items)
Adds multiple logits processors to the list.
Kind: instance method of LogitsProcessorList
Param | Type | Description |
---|---|---|
items | Array.<LogitsProcessor> | The logits processor functions to add. |
logitsProcessorList._call(input_ids, logits)
Applies all logits processors in the list to a batch of logits, modifying them in-place.
Kind: instance method of LogitsProcessorList
Param | Type | Description |
---|---|---|
input_ids | Array.<Array<bigint>> | The input IDs for the language model. |
logits | Tensor |
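The chaining behaviour described above can be sketched without the library. This is only an illustration: `SketchProcessor`, `SketchProcessorList`, `AddBias`, and `Scale` are hypothetical stand-ins, and a plain array of numbers takes the place of a `Tensor`.

```javascript
// Hypothetical stand-ins; plain number arrays replace the library's Tensor.
class SketchProcessor {
  _call(input_ids, logits) {
    throw new Error("`_call` should be implemented in a subclass");
  }
}

class AddBias extends SketchProcessor {
  constructor(bias) { super(); this.bias = bias; }
  _call(input_ids, logits) {
    for (let i = 0; i < logits.length; ++i) logits[i] += this.bias;
    return logits;
  }
}

class Scale extends SketchProcessor {
  constructor(factor) { super(); this.factor = factor; }
  _call(input_ids, logits) {
    for (let i = 0; i < logits.length; ++i) logits[i] *= this.factor;
    return logits;
  }
}

class SketchProcessorList {
  constructor() { this.processors = []; }
  push(item) { this.processors.push(item); }
  extend(items) { this.processors.push(...items); }
  // Apply every processor in insertion order; each one sees the previous output.
  _call(input_ids, logits) {
    let out = logits;
    for (const processor of this.processors) out = processor._call(input_ids, out);
    return out;
  }
}

const list = new SketchProcessorList();
list.push(new AddBias(1));
list.extend([new Scale(2)]);
const chained = list._call([[0n]], [1, 2, 3]); // computes (x + 1) * 2 per entry
```

Order matters in such a list: swapping the two processors would compute `x * 2 + 1` instead.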
generation/logits_process.ForcedBOSTokenLogitsProcessor
A LogitsProcessor that forces a BOS token at the beginning of the generated sequence.
Kind: static class of generation/logits_process
new ForcedBOSTokenLogitsProcessor(bos_token_id)
Create a ForcedBOSTokenLogitsProcessor.
Param | Type | Description |
---|---|---|
bos_token_id | number | The ID of the beginning-of-sequence token to be forced. |
forcedBOSTokenLogitsProcessor._call(input_ids, logits) ⇒ <code>Tensor</code>
Apply the BOS token forcing to the logits.
Kind: instance method of ForcedBOSTokenLogitsProcessor
Returns: Tensor
- The logits with BOS token forcing.
Param | Type | Description |
---|---|---|
input_ids | Array.<Array<bigint>> | The input IDs. |
logits | Tensor | The logits. |
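As a rough illustration of BOS forcing, the sketch below works on one sequence and a plain array of logits. The helper name `forceBos` is hypothetical, and the condition used here (the sequence holds exactly one token so far) is an assumption about what "the beginning" means, not a statement about the library's internals.

```javascript
// Sketch only: `forceBos` is a hypothetical helper, not a library function.
// Assumption: forcing happens when the sequence contains exactly one token.
function forceBos(sequence, logits, bosTokenId) {
  if (sequence.length !== 1) return logits;
  const out = new Array(logits.length).fill(-Infinity);
  out[bosTokenId] = 0; // the only finite score, so every decoding strategy picks it
  return out;
}

const forcedFirst = forceBos([2n], [0.1, 0.7, 0.2], 0);
const untouched = forceBos([2n, 0n], [0.1, 0.7, 0.2], 0);
```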
generation/logits_process.ForcedEOSTokenLogitsProcessor
A logits processor that enforces the specified token as the last generated token when `max_length` is reached.
Kind: static class of generation/logits_process
new ForcedEOSTokenLogitsProcessor(max_length, eos_token_id)
Create a ForcedEOSTokenLogitsProcessor.
Param | Type | Description |
---|---|---|
max_length | number | The maximum length of the sequence to be generated. |
eos_token_id | number | Array<number> | The id(s) of the end-of-sequence token. |
forcedEOSTokenLogitsProcessor._call(input_ids, logits)
Apply the processor to input_ids and logits.
Kind: instance method of ForcedEOSTokenLogitsProcessor
Param | Type | Description |
---|---|---|
input_ids | Array.<Array<bigint>> | The input ids. |
logits | Tensor | The logits tensor. |
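A minimal sketch of EOS forcing on a single sequence, using a plain array for the logits. `forceEos` is a hypothetical helper, and the trigger condition (the sequence is one token short of `max_length`) is an assumption made for the example.

```javascript
// Sketch only: `forceEos` is a hypothetical helper, not a library function.
// Assumption: EOS is forced once the sequence is one token short of maxLength.
function forceEos(sequence, logits, maxLength, eosTokenId) {
  if (sequence.length !== maxLength - 1) return logits;
  const eosIds = Array.isArray(eosTokenId) ? eosTokenId : [eosTokenId];
  const out = new Array(logits.length).fill(-Infinity);
  for (const id of eosIds) out[id] = 0; // only EOS candidates remain selectable
  return out;
}

const lastStep = forceEos([5n, 6n, 7n], [1, 2, 3, 4], 4, 3);
const earlyStep = forceEos([5n], [1, 2, 3, 4], 4, 3);
```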
generation/logits_process.SuppressTokensAtBeginLogitsProcessor
A LogitsProcessor that suppresses a list of tokens as soon as the `generate` function starts generating using `begin_index` tokens. This should ensure that the tokens defined by `begin_suppress_tokens` are not sampled at the beginning of the generation.
Kind: static class of generation/logits_process
new SuppressTokensAtBeginLogitsProcessor(begin_suppress_tokens, begin_index)
Create a SuppressTokensAtBeginLogitsProcessor.
Param | Type | Description |
---|---|---|
begin_suppress_tokens | Array.<number> | The IDs of the tokens to suppress. |
begin_index | number | The number of tokens to generate before suppressing tokens. |
suppressTokensAtBeginLogitsProcessor._call(input_ids, logits) ⇒ <code>Tensor</code>
Apply the token suppression to the logits.
Kind: instance method of SuppressTokensAtBeginLogitsProcessor
Returns: Tensor
- The logits with the listed tokens suppressed.
Param | Type | Description |
---|---|---|
input_ids | Array.<Array<bigint>> | The input IDs. |
logits | Tensor | The logits. |
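The idea can be sketched on one sequence with a plain array of logits. `suppressAtBegin` is a hypothetical helper, and the assumption here is that suppression applies only at the step where the sequence length equals `begin_index`.

```javascript
// Sketch only: `suppressAtBegin` is a hypothetical helper, not a library function.
// Assumption: suppression applies when the sequence length equals beginIndex.
function suppressAtBegin(sequence, logits, beginSuppressTokens, beginIndex) {
  if (sequence.length !== beginIndex) return logits;
  const out = logits.slice();
  for (const id of beginSuppressTokens) out[id] = -Infinity;
  return out;
}

const firstStep = suppressAtBegin([1n, 2n], [0.5, 0.4, 0.3, 0.2], [1, 3], 2);
const laterStep = suppressAtBegin([1n, 2n, 0n], [0.5, 0.4, 0.3, 0.2], [1, 3], 2);
```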
generation/logits_process.WhisperTimeStampLogitsProcessor
A LogitsProcessor that handles adding timestamps to generated text.
Kind: static class of generation/logits_process
new WhisperTimeStampLogitsProcessor(generate_config, init_tokens)
Constructs a new WhisperTimeStampLogitsProcessor.
Param | Type | Description |
---|---|---|
generate_config | * | The config object passed to the |
init_tokens | Array.<number> | The initial tokens of the input sequence. |
whisperTimeStampLogitsProcessor._call(input_ids, logits) ⇒ <code>Tensor</code>
Modify the logits to handle timestamp tokens.
Kind: instance method of WhisperTimeStampLogitsProcessor
Returns: Tensor
- The modified logits.
Param | Type | Description |
---|---|---|
input_ids | Array.<Array<bigint>> | The input sequence of tokens. |
logits | Tensor | The logits output by the model. |
generation/logits_process.NoRepeatNGramLogitsProcessor
A logits processor that disallows ngrams of a certain size to be repeated.
Kind: static class of generation/logits_process
- .NoRepeatNGramLogitsProcessor
    - new NoRepeatNGramLogitsProcessor(no_repeat_ngram_size)
    - .getNgrams(prevInputIds) ⇒ Map.<string, Array<number>>
    - .getGeneratedNgrams(bannedNgrams, prevInputIds) ⇒ Array.<number>
    - .calcBannedNgramTokens(prevInputIds) ⇒ Array.<number>
    - ._call(input_ids, logits) ⇒ Tensor
new NoRepeatNGramLogitsProcessor(no_repeat_ngram_size)
Create a NoRepeatNGramLogitsProcessor.
Param | Type | Description |
---|---|---|
no_repeat_ngram_size | number | The no-repeat-ngram size. All ngrams of this size can only occur once. |
noRepeatNGramLogitsProcessor.getNgrams(prevInputIds) ⇒ <code>Map.&lt;string, Array&lt;number&gt;&gt;</code>
Generate n-grams from a sequence of token ids.
Kind: instance method of NoRepeatNGramLogitsProcessor
Returns: Map.<string, Array<number>>
- Map of generated n-grams
Param | Type | Description |
---|---|---|
prevInputIds | Array.<bigint> | List of previous input ids |
noRepeatNGramLogitsProcessor.getGeneratedNgrams(bannedNgrams, prevInputIds) ⇒ <code>Array.&lt;number&gt;</code>
Get the banned tokens for the current position, given the map of banned n-grams and the previous input ids.
Kind: instance method of NoRepeatNGramLogitsProcessor
Returns: Array.<number>
- The banned token ids
Param | Type | Description |
---|---|---|
bannedNgrams | Map.<string, Array<number>> | Map of banned n-grams |
prevInputIds | Array.<bigint> | List of previous input ids |
noRepeatNGramLogitsProcessor.calcBannedNgramTokens(prevInputIds) ⇒ <code>Array.&lt;number&gt;</code>
Calculate banned n-gram tokens.
Kind: instance method of NoRepeatNGramLogitsProcessor
Returns: Array.<number>
- The banned token ids
Param | Type | Description |
---|---|---|
prevInputIds | Array.<bigint> | List of previous input ids |
noRepeatNGramLogitsProcessor._call(input_ids, logits) ⇒ <code>Tensor</code>
Apply the no-repeat-ngram processor to the logits.
Kind: instance method of NoRepeatNGramLogitsProcessor
Returns: Tensor
- The logits with no-repeat-ngram processing.
Param | Type | Description |
---|---|---|
input_ids | Array.<Array<bigint>> | The input IDs. |
logits | Tensor | The logits. |
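The n-gram bookkeeping behind this processor can be sketched as follows. The helper names (`collectNgrams`, `bannedNextTokens`, `applyNoRepeat`) are hypothetical and only loosely mirror the methods above; keying the map by the JSON string of each (n-1)-token prefix is an assumption suggested by the `Map.<string, Array<number>>` return type.

```javascript
// Sketch only: hypothetical helpers, with plain arrays standing in for Tensor.
// Map each (n-1)-token prefix, keyed by its JSON string, to the tokens that followed it.
function collectNgrams(ids, n) {
  const seen = new Map();
  for (let i = 0; i + n <= ids.length; ++i) {
    const ngram = ids.slice(i, i + n).map(Number);
    const key = JSON.stringify(ngram.slice(0, n - 1));
    const followers = seen.get(key) ?? [];
    followers.push(ngram[n - 1]);
    seen.set(key, followers);
  }
  return seen;
}

// Tokens that would complete an n-gram already present in `ids`.
function bannedNextTokens(ids, n) {
  if (ids.length + 1 < n) return []; // too short to form any n-gram yet
  const suffix = ids.slice(ids.length + 1 - n).map(Number);
  return collectNgrams(ids, n).get(JSON.stringify(suffix)) ?? [];
}

// Mask every banned token so it cannot be selected at this step.
function applyNoRepeat(ids, logits, n) {
  const out = logits.slice();
  for (const token of bannedNextTokens(ids, n)) out[token] = -Infinity;
  return out;
}

const history = [1n, 2n, 3n, 1n, 2n];
const banned = bannedNextTokens(history, 3);
const masked = applyNoRepeat(history, [0, 0, 0, 0.9], 3);
```

Here the sequence ends in `1, 2`, and the trigram `1, 2, 3` has already occurred, so token `3` is the one that gets masked.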
generation/logits_process.RepetitionPenaltyLogitsProcessor
A logits processor that prevents the repetition of previous tokens through a penalty. This penalty is applied at most once per token. Note that, for decoder-only models like most LLMs, the considered tokens include the prompt.
In the original paper, the authors suggest the use of a penalty of around 1.2 to achieve a good balance between truthful generation and lack of repetition. To penalize and reduce repetition, use `penalty` values above 1.0, where a higher value penalizes more strongly. To reward and encourage repetition, use `penalty` values between 0.0 and 1.0, where a lower value rewards more strongly.
Kind: static class of generation/logits_process
new RepetitionPenaltyLogitsProcessor(penalty)
Create a RepetitionPenaltyLogitsProcessor.
Param | Type | Description |
---|---|---|
penalty | number | The parameter for repetition penalty. |
repetitionPenaltyLogitsProcessor._call(input_ids, logits) ⇒ <code>Tensor</code>
Apply the repetition penalty to the logits.
Kind: instance method of RepetitionPenaltyLogitsProcessor
Returns: Tensor
- The logits with repetition penalty processing.
Param | Type | Description |
---|---|---|
input_ids | Array.<Array<bigint>> | The input IDs. |
logits | Tensor | The logits. |
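A common way to apply such a penalty is to divide positive logits by `penalty` and multiply negative ones by it, so that values above 1.0 always lower the score. The sketch below assumes that rule; `applyRepetitionPenalty` is a hypothetical helper working on one sequence and a plain array of logits.

```javascript
// Sketch only: `applyRepetitionPenalty` is a hypothetical helper.
// Assumption: positive logits are divided by the penalty, negative ones multiplied.
function applyRepetitionPenalty(sequence, logits, penalty) {
  const out = logits.slice();
  // A Set ensures each previously seen token is penalised at most once.
  for (const id of new Set(sequence.map(Number))) {
    out[id] = out[id] < 0 ? out[id] * penalty : out[id] / penalty;
  }
  return out;
}

// Token 0 appears twice in the history but is still penalised only once.
const penalised = applyRepetitionPenalty([0n, 0n, 2n], [2.4, 1.0, -1.0], 1.2);
```

With `penalty` below 1.0 the same rule raises the scores of seen tokens, which matches the "reward repetition" case described above.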
generation/logits_process.MinLengthLogitsProcessor
A logits processor that enforces a minimum number of tokens.
Kind: static class of generation/logits_process
new MinLengthLogitsProcessor(min_length, eos_token_id)
Create a MinLengthLogitsProcessor.
Param | Type | Description |
---|---|---|
min_length | number | The minimum length below which the score of |
eos_token_id | number | Array<number> | The ID/IDs of the end-of-sequence token. |
minLengthLogitsProcessor._call(input_ids, logits) ⇒ <code>Tensor</code>
Apply logit processor.
Kind: instance method of MinLengthLogitsProcessor
Returns: Tensor
- The processed logits.
Param | Type | Description |
---|---|---|
input_ids | Array.<Array<bigint>> | The input IDs. |
logits | Tensor | The logits. |
generation/logits_process.MinNewTokensLengthLogitsProcessor
A logits processor that enforces a minimum number of new tokens.
Kind: static class of generation/logits_process
new MinNewTokensLengthLogitsProcessor(prompt_length_to_skip, min_new_tokens, eos_token_id)
Create a MinNewTokensLengthLogitsProcessor.
Param | Type | Description |
---|---|---|
prompt_length_to_skip | number | The input tokens length. |
min_new_tokens | number | The minimum new tokens length below which the score of |
eos_token_id | number | Array<number> | The ID/IDs of the end-of-sequence token. |
minNewTokensLengthLogitsProcessor._call(input_ids, logits) ⇒ <code>Tensor</code>
Apply logit processor.
Kind: instance method of MinNewTokensLengthLogitsProcessor
Returns: Tensor
- The processed logits.
Param | Type | Description |
---|---|---|
input_ids | Array.<Array<bigint>> | The input IDs. |
logits | Tensor | The logits. |
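Minimum-length constraints are usually enforced by making the end-of-sequence token unselectable until enough tokens exist. The sketch below assumes that approach; `blockEosUntil` is a hypothetical helper. Passing `promptLengthToSkip = 0` counts the whole sequence, as a plain minimum-length check would, while a non-zero value counts only newly generated tokens.

```javascript
// Sketch only: `blockEosUntil` is a hypothetical helper, not a library function.
// Assumption: EOS is blocked by setting its logit to -Infinity.
function blockEosUntil(sequence, logits, eosTokenId, minNewTokens, promptLengthToSkip = 0) {
  const newTokens = sequence.length - promptLengthToSkip;
  if (newTokens >= minNewTokens) return logits;
  const eosIds = Array.isArray(eosTokenId) ? eosTokenId : [eosTokenId];
  const out = logits.slice();
  for (const id of eosIds) out[id] = -Infinity;
  return out;
}

// Prompt of 3 tokens, 1 generated so far, at least 2 new tokens required: EOS (id 1) is blocked.
const blocked = blockEosUntil([7n, 8n, 9n, 4n], [0.2, 0.9, 0.1], 1, 2, 3);
// One more generated token satisfies the minimum, so the logits pass through.
const allowed = blockEosUntil([7n, 8n, 9n, 4n, 5n], [0.2, 0.9, 0.1], 1, 2, 3);
```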
generation/logits_process.NoBadWordsLogitsProcessor
Kind: static class of generation/logits_process
new NoBadWordsLogitsProcessor(bad_words_ids, eos_token_id)
Create a NoBadWordsLogitsProcessor.
Param | Type | Description |
---|---|---|
bad_words_ids | Array.<Array<number>> | List of list of token ids that are not allowed to be generated. |
eos_token_id | number | Array<number> | The id of the end-of-sequence token. Optionally, use a list to set multiple end-of-sequence tokens. |
noBadWordsLogitsProcessor._call(input_ids, logits) ⇒ <code>Tensor</code>
Apply logit processor.
Kind: instance method of NoBadWordsLogitsProcessor
Returns: Tensor
- The processed logits.
Param | Type | Description |
---|---|---|
input_ids | Array.<Array<bigint>> | The input IDs. |
logits | Tensor | The logits. |
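One way to block multi-token "bad words" is to ban the final token of a word only when all of its earlier tokens have just been generated. The sketch below assumes that strategy; `bannedByBadWords` is a hypothetical helper and does not reproduce the library's code.

```javascript
// Sketch only: `bannedByBadWords` is a hypothetical helper.
// Returns the token ids that must not be generated next.
function bannedByBadWords(sequence, badWordsIds) {
  const ids = sequence.map(Number);
  const banned = new Set();
  for (const word of badWordsIds) {
    const prefix = word.slice(0, -1);
    if (prefix.length > ids.length) continue;
    const tail = ids.slice(ids.length - prefix.length);
    // Ban the final token only when everything before it has just been generated.
    if (prefix.every((token, i) => token === tail[i])) banned.add(word[word.length - 1]);
  }
  return [...banned];
}

// [7] is a single-token word (always banned); [4, 5] matches because the sequence ends in 4.
const nextBanned = bannedByBadWords([1n, 4n], [[7], [4, 5], [9, 5, 6]]);
```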
generation/logits_process.ClassifierFreeGuidanceLogitsProcessor
A `LogitsProcessor` for classifier-free guidance (CFG). The scores are split over the batch dimension, where the first half corresponds to the conditional logits (predicted from the input prompt) and the second half corresponds to the unconditional logits (predicted from an empty or "null" prompt). The processor computes a weighted average across the conditional and unconditional logits, parameterised by the `guidance_scale`.
See the paper for more information.
Kind: static class of generation/logits_process
new ClassifierFreeGuidanceLogitsProcessor(guidance_scale)
Create a ClassifierFreeGuidanceLogitsProcessor.
Param | Type | Description |
---|---|---|
guidance_scale | number | The guidance scale for classifier free guidance (CFG). CFG is enabled by setting |
classifierFreeGuidanceLogitsProcessor._call(input_ids, logits) ⇒ <code>Tensor</code>
Apply logit processor.
Kind: instance method of ClassifierFreeGuidanceLogitsProcessor
Returns: Tensor
- The processed logits.
Param | Type | Description |
---|---|---|
input_ids | Array.<Array<bigint>> | The input IDs. |
logits | Tensor | The logits. |
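The weighted combination can be sketched as `uncond + guidance_scale * (cond - uncond)`, applied row by row after splitting the batch in half. The helper `classifierFreeGuidance` is hypothetical, nested arrays stand in for a 2D `Tensor`, and this exact formula is an assumption based on the usual CFG formulation.

```javascript
// Sketch only: `classifierFreeGuidance` is a hypothetical helper.
// `logits` holds the conditional rows first, then the unconditional rows.
function classifierFreeGuidance(logits, guidanceScale) {
  if (logits.length % 2 !== 0) {
    throw new Error("expected conditional and unconditional halves of equal size");
  }
  const half = logits.length / 2;
  const out = [];
  for (let b = 0; b < half; ++b) {
    const cond = logits[b];
    const uncond = logits[b + half];
    out.push(cond.map((c, i) => uncond[i] + guidanceScale * (c - uncond[i])));
  }
  return out;
}

// One conditional row [2, 0] and one unconditional row [1, 1], scale 3.
const guided = classifierFreeGuidance([[2, 0], [1, 1]], 3);
```

Note that the output batch is half the size of the input, and a scale of 1 simply returns the conditional logits.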
generation/logits_process.TemperatureLogitsWarper
A `LogitsWarper` for temperature (exponential scaling of the output probability distribution), which effectively means that it can control the randomness of the predicted tokens. Often used together with `TopPLogitsWarper` and `TopKLogitsWarper`.
Kind: static class of generation/logits_process
new TemperatureLogitsWarper(temperature)
Create a TemperatureLogitsWarper.
Param | Type | Description |
---|---|---|
temperature | number | Strictly positive float value used to modulate the logits distribution. A value smaller than |
temperatureLogitsWarper._call(input_ids, logits) ⇒ <code>Tensor</code>
Apply logit warper.
Kind: instance method of TemperatureLogitsWarper
Returns: Tensor
- The processed logits.
Param | Type | Description |
---|---|---|
input_ids | Array.<Array<bigint>> | The input IDs. |
logits | Tensor | The logits. |
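Temperature scaling is commonly implemented by dividing every logit by the temperature before the softmax. The sketch below assumes that form; `applyTemperature` and `softmax` are hypothetical helpers on plain arrays, included only to show how the temperature sharpens or flattens the resulting distribution.

```javascript
// Sketch only: hypothetical helpers, with plain arrays standing in for Tensor.
function applyTemperature(logits, temperature) {
  if (!(temperature > 0)) throw new Error("temperature must be strictly positive");
  return logits.map((x) => x / temperature);
}

// Numerically stable softmax, used here only to inspect the effect.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

const sharp = softmax(applyTemperature([2, 1, 0], 0.5)); // low temperature
const flat = softmax(applyTemperature([2, 1, 0], 2)); // high temperature
```

The most likely token receives a larger share of the probability in `sharp` than in `flat`, so sampling from `sharp` is less random.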
generation/logits_process.TopPLogitsWarper
A `LogitsWarper` that performs top-p filtering, i.e. restricting to the smallest set of most probable tokens whose cumulative probability reaches `top_p`. Often used together with `TemperatureLogitsWarper` and `TopKLogitsWarper`.
Kind: static class of generation/logits_process
new TopPLogitsWarper(top_p, options)
Create a TopPLogitsWarper.
Param | Type | Default | Description |
---|---|---|---|
top_p | number | | If set to < 1, only the smallest set of most probable tokens with probabilities that add up to |
options | Object | | Additional options for the top-p sampling. |
[options.filter_value] | number | -Infinity | All filtered values will be set to this float value. |
[options.min_tokens_to_keep] | number | 1 | Minimum number of tokens that cannot be filtered. |
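A minimal sketch of top-p (nucleus) filtering on a plain array of logits. `topPFilter` is a hypothetical helper: it keeps tokens in descending order of probability until their cumulative probability reaches `topP`, always keeps at least `min_tokens_to_keep`, and sets everything else to `filter_value`. It illustrates the technique and is not the library's implementation.

```javascript
// Sketch only: `topPFilter` is a hypothetical helper, not a library function.
function topPFilter(logits, topP, { filter_value = -Infinity, min_tokens_to_keep = 1 } = {}) {
  // Softmax probabilities (shifted by the max for numerical stability).
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const total = exps.reduce((a, b) => a + b, 0);
  // Token indices ordered from most to least probable.
  const order = logits.map((_, i) => i).sort((a, b) => logits[b] - logits[a]);
  const out = new Array(logits.length).fill(filter_value);
  let cumulative = 0;
  for (let rank = 0; rank < order.length; ++rank) {
    // Stop once the kept set already covers topP and the minimum count is met.
    if (rank >= min_tokens_to_keep && cumulative >= topP) break;
    out[order[rank]] = logits[order[rank]];
    cumulative += exps[order[rank]] / total;
  }
  return out;
}

// Probabilities of roughly 0.5, 0.3 and 0.2.
const nucleusLogits = [Math.log(0.5), Math.log(0.3), Math.log(0.2)];
const nucleus = topPFilter(nucleusLogits, 0.7);
```

With `topP = 0.7`, the first two tokens (about 0.8 cumulative probability) are kept and the third is filtered.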
generation/logits_process.TopKLogitsWarper
A `LogitsWarper` that performs top-k, i.e. restricting to the k highest probability elements. Often used together with `TemperatureLogitsWarper` and `TopPLogitsWarper`.
Kind: static class of generation/logits_process
new TopKLogitsWarper(top_k, options)
Create a TopKLogitsWarper.
Param | Type | Default | Description |
---|---|---|---|
top_k | number | | If set to > 0, only the top |
options | Object | | Additional options for the top-k sampling. |
[options.filter_value] | number | -Infinity | All filtered values will be set to this float value. |
[options.min_tokens_to_keep] | number | 1 | Minimum number of tokens that cannot be filtered. |
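Top-k filtering can be sketched in a few lines on a plain array of logits. `topKFilter` is a hypothetical helper: it keeps the `k` largest logits, where `k` is at least `min_tokens_to_keep` and at most the vocabulary size, and replaces the rest with `filter_value`.

```javascript
// Sketch only: `topKFilter` is a hypothetical helper, not a library function.
function topKFilter(logits, topK, { filter_value = -Infinity, min_tokens_to_keep = 1 } = {}) {
  // Never keep fewer than min_tokens_to_keep or more than the vocabulary size.
  const k = Math.min(Math.max(topK, min_tokens_to_keep), logits.length);
  // Token indices ordered from highest to lowest logit.
  const order = logits.map((_, i) => i).sort((a, b) => logits[b] - logits[a]);
  const keep = new Set(order.slice(0, k));
  return logits.map((x, i) => (keep.has(i) ? x : filter_value));
}

const topTwo = topKFilter([0.1, 2, 1.5, -3], 2);
```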