Transformers.js documentation

models

Definitions of all models available in Transformers.js.

Example: Load and run an AutoModel.

import { AutoModel, AutoTokenizer } from '@xenova/transformers';

let tokenizer = await AutoTokenizer.from_pretrained('Xenova/bert-base-uncased');
let model = await AutoModel.from_pretrained('Xenova/bert-base-uncased');

let inputs = await tokenizer('I love transformers!');
let { logits } = await model(inputs);
// Tensor {
//     data: Float32Array(183132) [-7.117443084716797, -7.107812881469727, -7.092104911804199, ...]
//     dims: (3) [1, 6, 30522],
//     type: "float32",
//     size: 183132,
// }

We also provide other AutoModels (listed below), which you can use in the same way as the Python library. For example:

Example: Load and run an AutoModelForSeq2SeqLM.

import { AutoModelForSeq2SeqLM, AutoTokenizer } from '@xenova/transformers';

let tokenizer = await AutoTokenizer.from_pretrained('Xenova/t5-small');
let model = await AutoModelForSeq2SeqLM.from_pretrained('Xenova/t5-small');

let { input_ids } = await tokenizer('translate English to German: I love transformers!');
let outputs = await model.generate(input_ids);
let decoded = tokenizer.decode(outputs[0], { skip_special_tokens: true });
// 'Ich liebe Transformatoren!'

models
- static
  - .PreTrainedModel
    - new PreTrainedModel(config, session)
    - instance
      - .dispose() ⇒ Promise.<Array<unknown>>
      - ._call(model_inputs) ⇒ Promise.<Object>
      - .forward(model_inputs) ⇒ Promise.<Object>
      - ._get_generation_config(generation_config) ⇒ *
      - .groupBeams(beams) ⇒ Array
      - .getPastKeyValues(decoderResults, pastKeyValues) ⇒ Object
      - .getAttentions(decoderResults) ⇒ Object
      - .addPastKeyValues(decoderFeeds, pastKeyValues)
    - static
      - .from_pretrained(pretrained_model_name_or_path, options) ⇒ Promise.<PreTrainedModel>
  - .BaseModelOutput
    - new BaseModelOutput(output)
  - .BertForMaskedLM
    - ._call(model_inputs) ⇒ Promise.<MaskedLMOutput>
  - .BertForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .BertForTokenClassification
    - ._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>
  - .BertForQuestionAnswering
    - ._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>
  - .RoFormerModel
  - .RoFormerForMaskedLM
    - ._call(model_inputs) ⇒ Promise.<MaskedLMOutput>
  - .RoFormerForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .RoFormerForTokenClassification
    - ._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>
  - .RoFormerForQuestionAnswering
    - ._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>
  - .ConvBertModel
  - .ConvBertForMaskedLM
    - ._call(model_inputs) ⇒ Promise.<MaskedLMOutput>
  - .ConvBertForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .ConvBertForTokenClassification
    - ._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>
  - .ConvBertForQuestionAnswering
    - ._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>
  - .ElectraModel
  - .ElectraForMaskedLM
    - ._call(model_inputs) ⇒ Promise.<MaskedLMOutput>
  - .ElectraForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .ElectraForTokenClassification
    - ._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>
  - .ElectraForQuestionAnswering
    - ._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>
  - .CamembertModel
  - .CamembertForMaskedLM
    - ._call(model_inputs) ⇒ Promise.<MaskedLMOutput>
  - .CamembertForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .CamembertForTokenClassification
    - ._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>
  - .CamembertForQuestionAnswering
    - ._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>
  - .DebertaModel
  - .DebertaForMaskedLM
    - ._call(model_inputs) ⇒ Promise.<MaskedLMOutput>
  - .DebertaForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .DebertaForTokenClassification
    - ._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>
  - .DebertaForQuestionAnswering
    - ._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>
  - .DebertaV2Model
  - .DebertaV2ForMaskedLM
    - ._call(model_inputs) ⇒ Promise.<MaskedLMOutput>
  - .DebertaV2ForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .DebertaV2ForTokenClassification
    - ._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>
  - .DebertaV2ForQuestionAnswering
    - ._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>
  - .DistilBertForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .DistilBertForTokenClassification
    - ._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>
  - .DistilBertForQuestionAnswering
    - ._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>
  - .DistilBertForMaskedLM
    - ._call(model_inputs) ⇒ Promise.<MaskedLMOutput>
  - .EsmModel
  - .EsmForMaskedLM
    - ._call(model_inputs) ⇒ Promise.<MaskedLMOutput>
  - .EsmForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .EsmForTokenClassification
    - ._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>
  - .MobileBertForMaskedLM
    - ._call(model_inputs) ⇒ Promise.<MaskedLMOutput>
  - .MobileBertForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .MobileBertForQuestionAnswering
    - ._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>
  - .MPNetModel
  - .MPNetForMaskedLM
    - ._call(model_inputs) ⇒ Promise.<MaskedLMOutput>
  - .MPNetForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .MPNetForTokenClassification
    - ._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>
  - .MPNetForQuestionAnswering
    - ._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>
  - .T5ForConditionalGeneration
    - new T5ForConditionalGeneration(config, session, decoder_merged_session, generation_config)
  - .LongT5PreTrainedModel
  - .LongT5Model
  - .LongT5ForConditionalGeneration
    - new LongT5ForConditionalGeneration(config, session, decoder_merged_session, generation_config)
  - .MT5ForConditionalGeneration
    - new MT5ForConditionalGeneration(config, session, decoder_merged_session, generation_config)
  - .BartModel
  - .BartForConditionalGeneration
    - new BartForConditionalGeneration(config, session, decoder_merged_session, generation_config)
  - .BartForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .MBartModel
  - .MBartForConditionalGeneration
    - new MBartForConditionalGeneration(config, session, decoder_merged_session, generation_config)
  - .MBartForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .MBartForCausalLM
    - new MBartForCausalLM(config, decoder_merged_session, generation_config)
  - .BlenderbotModel
  - .BlenderbotForConditionalGeneration
    - new BlenderbotForConditionalGeneration(config, session, decoder_merged_session, generation_config)
  - .BlenderbotSmallModel
  - .BlenderbotSmallForConditionalGeneration
    - new BlenderbotSmallForConditionalGeneration(config, session, decoder_merged_session, generation_config)
  - .RobertaForMaskedLM
    - ._call(model_inputs) ⇒ Promise.<MaskedLMOutput>
  - .RobertaForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .RobertaForTokenClassification
    - ._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>
  - .RobertaForQuestionAnswering
    - ._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>
  - .XLMPreTrainedModel
  - .XLMModel
  - .XLMWithLMHeadModel
    - ._call(model_inputs) ⇒ Promise.<MaskedLMOutput>
  - .XLMForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .XLMForTokenClassification
    - ._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>
  - .XLMForQuestionAnswering
    - ._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>
  - .XLMRobertaForMaskedLM
    - ._call(model_inputs) ⇒ Promise.<MaskedLMOutput>
  - .XLMRobertaForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .XLMRobertaForTokenClassification
    - ._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>
  - .XLMRobertaForQuestionAnswering
    - ._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>
  - .ASTModel
  - .ASTForAudioClassification
  - .WhisperModel
  - .WhisperForConditionalGeneration
    - new WhisperForConditionalGeneration(config, session, decoder_merged_session, generation_config)
    - .generate(inputs, generation_config, logits_processor) ⇒ Promise.<Object>
    - ._extract_token_timestamps(generate_outputs, alignment_heads, [num_frames], [time_precision]) ⇒ Tensor
  - .VisionEncoderDecoderModel
    - new VisionEncoderDecoderModel(config, session, decoder_merged_session, generation_config)
  - .CLIPModel
  - .CLIPTextModelWithProjection
    - .from_pretrained() : PreTrainedModel.from_pretrained
  - .CLIPVisionModelWithProjection
    - .from_pretrained() : PreTrainedModel.from_pretrained
  - .SiglipModel
  - .SiglipTextModel
    - .from_pretrained() : PreTrainedModel.from_pretrained
  - .SiglipVisionModel
    - .from_pretrained() : PreTrainedModel.from_pretrained
  - .CLIPSegForImageSegmentation
  - .GPT2PreTrainedModel
    - new GPT2PreTrainedModel(config, session, generation_config)
  - .GPT2LMHeadModel
  - .GPTNeoPreTrainedModel
    - new GPTNeoPreTrainedModel(config, session, generation_config)
  - .GPTNeoXPreTrainedModel
    - new GPTNeoXPreTrainedModel(config, session, generation_config)
  - .GPTJPreTrainedModel
    - new GPTJPreTrainedModel(config, session, generation_config)
  - .GPTBigCodePreTrainedModel
    - new GPTBigCodePreTrainedModel(config, session, generation_config)
  - .CodeGenPreTrainedModel
    - new CodeGenPreTrainedModel(config, session, generation_config)
  - .CodeGenModel
  - .CodeGenForCausalLM
  - .LlamaPreTrainedModel
    - new LlamaPreTrainedModel(config, session, generation_config)
  - .LlamaModel
  - .Qwen2PreTrainedModel
    - new Qwen2PreTrainedModel(config, session, generation_config)
  - .Qwen2Model
  - .PhiPreTrainedModel
    - new PhiPreTrainedModel(config, session, generation_config)
  - .PhiModel
  - .BloomPreTrainedModel
    - new BloomPreTrainedModel(config, session, generation_config)
  - .BloomModel
  - .BloomForCausalLM
  - .MptPreTrainedModel
    - new MptPreTrainedModel(config, session, generation_config)
  - .MptModel
  - .MptForCausalLM
  - .OPTPreTrainedModel
    - new OPTPreTrainedModel(config, session, generation_config)
  - .OPTModel
  - .OPTForCausalLM
  - .VitMatteForImageMatting
    - ._call(model_inputs)
  - .DetrObjectDetectionOutput
    - new DetrObjectDetectionOutput(output)
  - .DetrSegmentationOutput
    - new DetrSegmentationOutput(output)
  - .TableTransformerModel
  - .TableTransformerForObjectDetection
    - ._call(model_inputs)
  - .ResNetPreTrainedModel
  - .ResNetModel
  - .ResNetForImageClassification
    - ._call(model_inputs)
  - .Swin2SRModel
  - .Swin2SRForImageSuperResolution
  - .DPTModel
  - .DPTForDepthEstimation
  - .DepthAnythingForDepthEstimation
  - .GLPNModel
  - .GLPNForDepthEstimation
  - .DonutSwinModel
  - .ConvNextModel
  - .ConvNextForImageClassification
    - ._call(model_inputs)
  - .ConvNextV2Model
  - .ConvNextV2ForImageClassification
    - ._call(model_inputs)
  - .Dinov2Model
  - .Dinov2ForImageClassification
    - ._call(model_inputs)
  - .YolosObjectDetectionOutput
    - new YolosObjectDetectionOutput(output)
  - .SamModel
    - new SamModel(config, vision_encoder, prompt_encoder_mask_decoder)
    - .get_image_embeddings(model_inputs) ⇒ Promise.<{image_embeddings: Tensor, image_positional_embeddings: Tensor}>
    - .forward(model_inputs) ⇒ Promise.<Object>
    - ._call(model_inputs) ⇒ Promise.<SamImageSegmentationOutput>
  - .SamImageSegmentationOutput
    - new SamImageSegmentationOutput(output)
  - .MarianMTModel
    - new MarianMTModel(config, session, decoder_merged_session, generation_config)
  - .M2M100ForConditionalGeneration
    - new M2M100ForConditionalGeneration(config, session, decoder_merged_session, generation_config)
  - .Wav2Vec2Model
  - .Wav2Vec2ForAudioFrameClassification
    - ._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>
  - .UniSpeechModel
  - .UniSpeechForCTC
    - ._call(model_inputs)
  - .UniSpeechForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .UniSpeechSatModel
  - .UniSpeechSatForCTC
    - ._call(model_inputs)
  - .UniSpeechSatForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .UniSpeechSatForAudioFrameClassification
    - ._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>
  - .Wav2Vec2BertModel
  - .Wav2Vec2BertForCTC
    - ._call(model_inputs)
  - .Wav2Vec2BertForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .HubertModel
  - .HubertForCTC
    - ._call(model_inputs)
  - .HubertForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .WavLMPreTrainedModel
  - .WavLMModel
  - .WavLMForCTC
    - ._call(model_inputs)
  - .WavLMForSequenceClassification
    - ._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>
  - .WavLMForXVector
    - ._call(model_inputs) ⇒ Promise.<XVectorOutput>
  - .WavLMForAudioFrameClassification
    - ._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>
  - .SpeechT5PreTrainedModel
  - .SpeechT5Model
  - .SpeechT5ForSpeechToText
  - .SpeechT5ForTextToSpeech
    - new SpeechT5ForTextToSpeech(config, session, decoder_merged_session, generation_config)
    - .generate_speech(input_values, speaker_embeddings, options) ⇒ Promise.<SpeechOutput>
  - .SpeechT5HifiGan
  - .TrOCRPreTrainedModel
    - new TrOCRPreTrainedModel(config, session, generation_config)
  - .TrOCRForCausalLM
  - .MistralPreTrainedModel
    - new MistralPreTrainedModel(config, session, generation_config)
  - .Starcoder2PreTrainedModel
    - new Starcoder2PreTrainedModel(config, session, generation_config)
  - .FalconPreTrainedModel
    - new FalconPreTrainedModel(config, session, generation_config)
  - .ClapTextModelWithProjection
    - .from_pretrained() : PreTrainedModel.from_pretrained
  - .ClapAudioModelWithProjection
    - .from_pretrained() : PreTrainedModel.from_pretrained
  - .VitsModel
    - ._call(model_inputs) ⇒ Promise.<VitsModelOutput>
  - .SegformerModel
  - .SegformerForImageClassification
  - .SegformerForSemanticSegmentation
  - .StableLmPreTrainedModel
    - new StableLmPreTrainedModel(config, session, generation_config)
  - .StableLmModel
  - .StableLmForCausalLM
  - .EfficientNetModel
  - .EfficientNetForImageClassification
    - ._call(model_inputs)
  - .PretrainedMixin
    - instance
      - .MODEL_CLASS_MAPPINGS : *
      - .BASE_IF_FAIL
    - static
      - .from_pretrained() : PreTrainedModel.from_pretrained
  - .AutoModel
    - .MODEL_CLASS_MAPPINGS : *
  - .AutoModelForSequenceClassification
  - .AutoModelForTokenClassification
  - .AutoModelForSeq2SeqLM
  - .AutoModelForSpeechSeq2Seq
  - .AutoModelForTextToSpectrogram
  - .AutoModelForTextToWaveform
  - .AutoModelForCausalLM
  - .AutoModelForMaskedLM
  - .AutoModelForQuestionAnswering
  - .AutoModelForVision2Seq
  - .AutoModelForImageClassification
  - .AutoModelForImageSegmentation
  - .AutoModelForSemanticSegmentation
  - .AutoModelForObjectDetection
  - .AutoModelForMaskGeneration
  - .Seq2SeqLMOutput
    - new Seq2SeqLMOutput(output)
  - .SequenceClassifierOutput
    - new SequenceClassifierOutput(output)
  - .XVectorOutput
    - new XVectorOutput(output)
  - .TokenClassifierOutput
    - new TokenClassifierOutput(output)
  - .MaskedLMOutput
    - new MaskedLMOutput(output)
  - .QuestionAnsweringModelOutput
    - new QuestionAnsweringModelOutput(output)
  - .CausalLMOutput
    - new CausalLMOutput(output)
  - .CausalLMOutputWithPast
    - new CausalLMOutputWithPast(output)
  - .ImageMattingOutput
    - new ImageMattingOutput(output)
  - .VitsModelOutput
    - new VitsModelOutput(output)
- inner
  - ~InferenceSession : *
  - ~TypedArray : *
  - ~DecoderOutput ⇒ Promise.<(Array<Array<number>>|EncoderDecoderOutput|DecoderOutput)>
  - ~WhisperGenerationConfig : Object
  - ~SamModelInputs : Object
  - ~SpeechOutput : Object

models.PreTrainedModel

A base class for pre-trained models that provides the model configuration and an ONNX session.

Kind: static class of models

.PreTrainedModel
- new PreTrainedModel(config, session)
- instance
  - .dispose() ⇒ Promise.<Array<unknown>>
  - ._call(model_inputs) ⇒ Promise.<Object>
  - .forward(model_inputs) ⇒ Promise.<Object>
  - ._get_generation_config(generation_config) ⇒ *
  - .groupBeams(beams) ⇒ Array
  - .getPastKeyValues(decoderResults, pastKeyValues) ⇒ Object
  - .getAttentions(decoderResults) ⇒ Object
  - .addPastKeyValues(decoderFeeds, pastKeyValues)
- static
  - .from_pretrained(pretrained_model_name_or_path, options) ⇒ Promise.<PreTrainedModel>

new PreTrainedModel(config, session)

Creates a new instance of the PreTrainedModel class.

Param	Type	Description
config	`Object`	The model configuration.
session	`any`	The ONNX session for the model.

preTrainedModel.dispose() ⇒ Promise.<Array<unknown>>

Disposes of all the ONNX sessions that were created during inference.

Kind: instance method of PreTrainedModel
Returns: Promise.<Array<unknown>> - An array of promises, one for each ONNX session that is being disposed.
Todo: Use https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/FinalizationRegistry
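The disposal pattern described above can be sketched in plain JavaScript. This is a minimal illustration, not the library's implementation: `FakeSession`, its `release()` method, and `disposeAll` are hypothetical names standing in for real ONNX sessions and their cleanup call.

```javascript
// Hypothetical stand-in for an ONNX session; not the real onnxruntime API.
class FakeSession {
  constructor(name) {
    this.name = name;
    this.released = false;
  }
  // Pretend to free native resources held by the session.
  async release() {
    this.released = true;
    return this.name;
  }
}

// Collect one promise per session-like property and wait for all of them.
function disposeAll(model) {
  const pending = [];
  for (const value of Object.values(model)) {
    if (value instanceof FakeSession) {
      pending.push(value.release());
    }
  }
  return Promise.all(pending);
}

// A model-like object holding two sessions plus a non-session property.
const fakeModel = {
  config: { model_type: 'demo' },
  session: new FakeSession('encoder'),
  decoder_merged_session: new FakeSession('decoder'),
};
const disposal = disposeAll(fakeModel);
```

Returning a single promise that wraps one promise per session lets the caller await the whole cleanup in one step.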

preTrainedModel._call(model_inputs) ⇒ Promise.<Object>

Runs the model with the provided inputs

Kind: instance method of PreTrainedModel
Returns: Promise.<Object> - Object containing output tensors

Param	Type	Description
model_inputs	`Object`	Object containing input tensors

preTrainedModel.forward(model_inputs) ⇒ Promise.<Object>

Forward method for a pretrained model. If not overridden by a subclass, the correct forward method will be chosen based on the model type.

Kind: instance method of PreTrainedModel
Returns: Promise.<Object> - The output data from the model in the format specified in the ONNX model.
Throws: Error - This method must be implemented in subclasses.

Param	Type	Description
model_inputs	`Object`	The input data to the model in the format specified in the ONNX model.

preTrainedModel._get_generation_config(generation_config) ⇒ *

This function merges multiple generation configs together to form a final generation config to be used by the model for text generation. It first creates an empty GenerationConfig object, then it applies the model’s own generation_config property to it. Finally, if a generation_config object was passed in the arguments, it overwrites the corresponding properties in the final config with those of the passed config object.

Kind: instance method of PreTrainedModel
Returns: * - The final generation config object to be used by the model for text generation.

Param	Type	Description
generation_config	`*`	A `GenerationConfig` object containing generation parameters.
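The three-step merge described above (empty config, then the model's own settings, then the caller's overrides) can be sketched with plain objects. This is an illustration under assumptions: `mergeGenerationConfig` is a hypothetical helper, and plain objects stand in for `GenerationConfig` instances.

```javascript
// Library-agnostic sketch; plain objects stand in for GenerationConfig.
function mergeGenerationConfig(modelConfig, userConfig) {
  const merged = {};                          // 1. start from an empty config
  Object.assign(merged, modelConfig ?? {});   // 2. apply the model's own settings
  Object.assign(merged, userConfig ?? {});    // 3. caller-supplied values win
  return merged;
}

const modelDefaults = { max_new_tokens: 20, num_beams: 1 };
const merged = mergeGenerationConfig(modelDefaults, { num_beams: 4 });
// merged is { max_new_tokens: 20, num_beams: 4 }; modelDefaults is unchanged
```

Copying into a fresh object means neither the model's defaults nor the caller's object is mutated by the merge.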

preTrainedModel.groupBeams(beams) ⇒ Array

Groups an array of beam objects by their ids.

Kind: instance method of PreTrainedModel
Returns: Array - An array of arrays, where each inner array contains beam objects with the same id.

Param	Type	Description
beams	`Array`	The array of beam objects to group.
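Grouping by id, as described above, can be sketched as follows. The helper name `groupBeamsById` and the `score` field are assumptions made for illustration; only the `id` property is taken from the description.

```javascript
// Group beam-like objects by their `id`, preserving first-seen order.
function groupBeamsById(beams) {
  const groups = new Map();
  for (const beam of beams) {
    if (!groups.has(beam.id)) {
      groups.set(beam.id, []);
    }
    groups.get(beam.id).push(beam);
  }
  // Return an array of arrays, one inner array per distinct id.
  return [...groups.values()];
}

const grouped = groupBeamsById([
  { id: 0, score: -0.1 },
  { id: 1, score: -0.4 },
  { id: 0, score: -0.7 },
]);
// grouped has two inner arrays: two beams with id 0, one beam with id 1
```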

preTrainedModel.getPastKeyValues(decoderResults, pastKeyValues) ⇒ Object

Returns an object containing past key values from the given decoder results object.

Kind: instance method of PreTrainedModel
Returns: Object - An object containing past key values.

Param	Type	Description
decoderResults	`Object`	The decoder results object.
pastKeyValues	`Object`	The previous past key values.
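One way to picture this step is the sketch below. It rests on assumptions that the text above does not state: that cache outputs are named with a `present` prefix, that the next step expects a `past_key_values` prefix, and that encoder-side entries can be reused from the previous step. Strings stand in for tensors, and `collectPastKeyValues` is a hypothetical name.

```javascript
// Build the cache for the next decoding step from the current decoder outputs.
function collectPastKeyValues(decoderResults, pastKeyValues = null) {
  const collected = {};
  for (const [name, value] of Object.entries(decoderResults)) {
    if (!name.startsWith('present')) continue; // skip logits and other outputs
    const newName = name.replace('present', 'past_key_values');
    if (pastKeyValues && name.includes('encoder') && newName in pastKeyValues) {
      // Assumption: encoder entries do not change between steps, so reuse them.
      collected[newName] = pastKeyValues[newName];
    } else {
      collected[newName] = value;
    }
  }
  return collected;
}

const previous = {
  'past_key_values.0.decoder.key': 'dk0',
  'past_key_values.0.encoder.key': 'ek0',
};
const next = collectPastKeyValues(
  { logits: 'L', 'present.0.decoder.key': 'dk1', 'present.0.encoder.key': 'ek1' },
  previous,
);
```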

preTrainedModel.getAttentions(decoderResults) ⇒ Object

Returns an object containing attentions from the given decoder results object.

Kind: instance method of PreTrainedModel
Returns: Object - An object containing attentions.

Param	Type	Description
decoderResults	`Object`	The decoder results object.

preTrainedModel.addPastKeyValues(decoderFeeds, pastKeyValues)

Adds past key values to the decoder feeds object. If pastKeyValues is null, creates new tensors for past key values.

Kind: instance method of PreTrainedModel

Param	Type	Description
decoderFeeds	`Object`	The decoder feeds object to add past key values to.
pastKeyValues	`Object`	An object containing past key values.

PreTrainedModel.from_pretrained(pretrained_model_name_or_path, options) ⇒ Promise.<PreTrainedModel>

Instantiate one of the model classes of the library from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible).

Kind: static method of PreTrainedModel
Returns: Promise.<PreTrainedModel> - A new instance of the PreTrainedModel class.

Param	Type	Description
pretrained_model_name_or_path	`string`	The name or path of the pretrained model. Can be either: (1) a string, the model id of a pretrained model hosted inside a model repo on huggingface.co, where valid model ids can be located at the root level, like `bert-base-uncased`, or namespaced under a user or organization name, like `dbmdz/bert-base-german-cased`; or (2) a path to a directory containing model weights, e.g., `./my_model_directory/`.
options	`*`	Additional options for loading the model.
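Selecting a class from the config's model_type can be sketched as a lookup table. Everything here is hypothetical and only illustrates the dispatch idea: `BaseModel`, `BertLikeModel`, `T5LikeModel`, `MODEL_TYPE_TO_CLASS`, and `instantiateFromConfig` are not names from the library, and no files are loaded.

```javascript
// Hypothetical model classes; each just stores its config.
class BaseModel {
  constructor(config) {
    this.config = config;
  }
}
class BertLikeModel extends BaseModel {}
class T5LikeModel extends BaseModel {}

// Lookup table from config.model_type to the class to instantiate.
const MODEL_TYPE_TO_CLASS = new Map([
  ['bert', BertLikeModel],
  ['t5', T5LikeModel],
]);

function instantiateFromConfig(config) {
  const ModelClass = MODEL_TYPE_TO_CLASS.get(config.model_type);
  if (!ModelClass) {
    throw new Error(`Unsupported model type: ${config.model_type}`);
  }
  return new ModelClass(config);
}

const chosen = instantiateFromConfig({ model_type: 't5' });
```

Failing loudly on an unknown model_type is one possible design; a real loader might instead fall back to a base class.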

models.BaseModelOutput

Base class for model outputs, with potential hidden states and attentions.

Kind: static class of models

new BaseModelOutput(output)

Param	Type	Description
output	`Object`	The output of the model.
output.last_hidden_state	`Tensor`	Sequence of hidden-states at the output of the last layer of the model.
[output.hidden_states]	`Tensor`	Hidden-states of the model at the output of each layer plus the optional initial embedding outputs.
[output.attentions]	`Tensor`	Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.

models.BertForMaskedLM

BertForMaskedLM is a class representing a BERT model for masked language modeling.

Kind: static class of models

bertForMaskedLM._call(model_inputs) ⇒ Promise.<MaskedLMOutput>

Calls the model on new inputs.

Kind: instance method of BertForMaskedLM
Returns: Promise.<MaskedLMOutput> - An object containing the model’s output logits for masked language modeling.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.BertForSequenceClassification

BertForSequenceClassification is a class representing a BERT model for sequence classification.

Kind: static class of models

bertForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of BertForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.
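Sequence classification logits are usually turned into probabilities with a softmax before a label is picked. The sketch below shows that post-processing step on a plain array; the helper names and the label list are assumptions for illustration, and no model is run.

```javascript
// Numerically stable softmax over a plain array of logits.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Pick the label with the highest probability.
function topLabel(logits, labels) {
  const probs = softmax(logits);
  let best = 0;
  for (let i = 1; i < probs.length; i++) {
    if (probs[i] > probs[best]) best = i;
  }
  return { label: labels[best], score: probs[best] };
}

// Hypothetical logits for a three-class classifier.
const prediction = topLabel([1.0, 3.0, 0.5], ['negative', 'positive', 'neutral']);
```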

models.BertForTokenClassification

BertForTokenClassification is a class representing a BERT model for token classification.

Kind: static class of models

bertForTokenClassification._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>

Calls the model on new inputs.

Kind: instance method of BertForTokenClassification
Returns: Promise.<TokenClassifierOutput> - An object containing the model’s output logits for token classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.
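For token classification, each token gets its own row of logits. Assuming the logits arrive as a flat typed array with dims of [batch, sequence length, number of labels], similar to the tensor printed in the first example on this page, the predicted label index per token can be sketched like this. `argmaxPerToken` is a hypothetical helper.

```javascript
// Return the index of the largest logit for every token position.
function argmaxPerToken(data, dims) {
  const [batch, seqLen, numLabels] = dims;
  const result = [];
  for (let t = 0; t < batch * seqLen; t++) {
    const offset = t * numLabels; // start of this token's row in the flat array
    let best = 0;
    for (let j = 1; j < numLabels; j++) {
      if (data[offset + j] > data[offset + best]) best = j;
    }
    result.push(best);
  }
  return result;
}

// One sequence, three tokens, two labels per token.
const tokenLogits = new Float32Array([0.1, 0.9, 2.0, -1.0, 0.3, 0.4]);
const labelIds = argmaxPerToken(tokenLogits, [1, 3, 2]);
```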

models.BertForQuestionAnswering

BertForQuestionAnswering is a class representing a BERT model for question answering.

Kind: static class of models

bertForQuestionAnswering._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>

Calls the model on new inputs.

Kind: instance method of BertForQuestionAnswering
Returns: Promise.<QuestionAnsweringModelOutput> - An object containing the model’s output logits for question answering.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.RoFormerModel

The bare RoFormer Model transformer outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.RoFormerForMaskedLM

RoFormer Model with a language modeling head on top.

Kind: static class of models

roFormerForMaskedLM._call(model_inputs) ⇒ Promise.<MaskedLMOutput>

Calls the model on new inputs.

Kind: instance method of RoFormerForMaskedLM
Returns: Promise.<MaskedLMOutput> - An object containing the model’s output logits for masked language modeling.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.RoFormerForSequenceClassification

RoFormer Model transformer with a sequence classification/regression head on top (a linear layer on top of the pooled output).

Kind: static class of models

roFormerForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of RoFormerForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.RoFormerForTokenClassification

RoFormer Model with a token classification head on top (a linear layer on top of the hidden-states output) e.g. for Named-Entity-Recognition (NER) tasks.

Kind: static class of models

roFormerForTokenClassification._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>

Calls the model on new inputs.

Kind: instance method of RoFormerForTokenClassification
Returns: Promise.<TokenClassifierOutput> - An object containing the model’s output logits for token classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.RoFormerForQuestionAnswering

RoFormer Model with a span classification head on top for extractive question-answering tasks like SQuAD (linear layers on top of the hidden-states output to compute span start logits and span end logits).

Kind: static class of models

roFormerForQuestionAnswering._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>

Calls the model on new inputs.

Kind: instance method of RoFormerForQuestionAnswering
Returns: Promise.<QuestionAnsweringModelOutput> - An object containing the model’s output logits for question answering.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.
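Given span start logits and span end logits, a common way to choose the answer is to pick the start/end pair with the highest combined score, subject to the end not preceding the start. The sketch below illustrates that selection on plain arrays; `bestSpan` and its `maxAnswerLength` parameter are assumptions for illustration, not part of the library.

```javascript
// Exhaustively score every valid (start, end) pair and keep the best one.
function bestSpan(startLogits, endLogits, maxAnswerLength = 15) {
  let best = { start: 0, end: 0, score: -Infinity };
  for (let s = 0; s < startLogits.length; s++) {
    const lastEnd = Math.min(endLogits.length, s + maxAnswerLength);
    for (let e = s; e < lastEnd; e++) {
      const score = startLogits[s] + endLogits[e];
      if (score > best.score) {
        best = { start: s, end: e, score };
      }
    }
  }
  return best;
}

// Hypothetical logits for a four-token context.
const span = bestSpan([0.1, 5.0, 0.2, 0.3], [0.0, 0.1, 4.0, 0.2]);
// span covers token positions 1 through 2
```

The resulting token positions would then be mapped back to text with the tokenizer.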

models.ConvBertModel

The bare ConvBERT Model transformer outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.ConvBertForMaskedLM

ConvBERT Model with a language modeling head on top.

Kind: static class of models

convBertForMaskedLM._call(model_inputs) ⇒ Promise.<MaskedLMOutput>

Calls the model on new inputs.

Kind: instance method of ConvBertForMaskedLM
Returns: Promise.<MaskedLMOutput> - An object containing the model’s output logits for masked language modeling.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.ConvBertForSequenceClassification

ConvBERT Model transformer with a sequence classification/regression head on top (a linear layer on top of the pooled output).

Kind: static class of models

convBertForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of ConvBertForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.ConvBertForTokenClassification

ConvBERT Model with a token classification head on top (a linear layer on top of the hidden-states output) e.g. for Named-Entity-Recognition (NER) tasks.

Kind: static class of models

convBertForTokenClassification._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>

Calls the model on new inputs.

Kind: instance method of ConvBertForTokenClassification
Returns: Promise.<TokenClassifierOutput> - An object containing the model’s output logits for token classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.ConvBertForQuestionAnswering

ConvBERT Model with a span classification head on top for extractive question-answering tasks like SQuAD (linear layers on top of the hidden-states output to compute span start logits and span end logits).

Kind: static class of models

convBertForQuestionAnswering._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>

Calls the model on new inputs.

Kind: instance method of ConvBertForQuestionAnswering
Returns: Promise.<QuestionAnsweringModelOutput> - An object containing the model’s output logits for question answering.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.ElectraModel

The bare Electra Model transformer outputting raw hidden-states without any specific head on top. Identical to the BERT model except that it uses an additional linear layer between the embedding layer and the encoder if the hidden size and embedding size are different.

Kind: static class of models

models.ElectraForMaskedLM

Electra model with a language modeling head on top.

Kind: static class of models

electraForMaskedLM._call(model_inputs) ⇒ Promise.<MaskedLMOutput>

Calls the model on new inputs.

Kind: instance method of ElectraForMaskedLM
Returns: Promise.<MaskedLMOutput> - An object containing the model’s output logits for masked language modeling.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.ElectraForSequenceClassification

ELECTRA Model transformer with a sequence classification/regression head on top (a linear layer on top of the pooled output).

Kind: static class of models

electraForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of ElectraForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.ElectraForTokenClassification

Electra model with a token classification head on top.

Kind: static class of models

electraForTokenClassification._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>

Calls the model on new inputs.

Kind: instance method of ElectraForTokenClassification
Returns: Promise.<TokenClassifierOutput> - An object containing the model’s output logits for token classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.
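
Token-classification logits carry one score per label for every token, typically with dims `[batch, sequence_length, num_labels]`. The sketch below picks the highest-scoring label id for each token of the first batch item. It assumes `data` is a typed array such as the `Float32Array` shown in the `Tensor` printout at the top of this page; `labelIdsPerToken` is a hypothetical helper.

```javascript
// Index of the largest value in an array-like.
function argmax(values) {
  let best = 0;
  for (let i = 1; i < values.length; ++i) {
    if (values[i] > values[best]) best = i;
  }
  return best;
}

// For logits with dims [1, seqLen, numLabels], return one label id per token.
function labelIdsPerToken(data, dims) {
  const [, seqLen, numLabels] = dims;
  const ids = [];
  for (let t = 0; t < seqLen; ++t) {
    ids.push(argmax(data.subarray(t * numLabels, (t + 1) * numLabels)));
  }
  return ids;
}

// Toy logits: 3 tokens, 3 labels.
const logits = new Float32Array([
  3.0, 0.1, 0.2, // token 0 -> label 0
  0.0, 0.5, 4.0, // token 1 -> label 2
  0.3, 2.5, 0.1, // token 2 -> label 1
]);
console.log(labelIdsPerToken(logits, [1, 3, 3])); // [ 0, 2, 1 ]
```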

models.ElectraForQuestionAnswering

ELECTRA Model with a span classification head on top for extractive question-answering tasks like SQuAD (linear layers on top of the hidden-states output to compute span start logits and span end logits).

Kind: static class of models

electraForQuestionAnswering._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>

Calls the model on new inputs.

Kind: instance method of ElectraForQuestionAnswering
Returns: Promise.<QuestionAnsweringModelOutput> - An object containing the model’s output logits for question answering.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.
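
Extractive question answering produces one start score and one end score per token, and the answer is the span whose combined score is highest. The sketch below is a simple exhaustive search, assuming the start and end logits have already been copied into plain arrays; `bestSpan` and `maxAnswerLen` are hypothetical names, and this is not the library's own post-processing.

```javascript
// Pick the (start, end) token pair with the highest combined score,
// subject to start <= end and a maximum answer length in tokens.
function bestSpan(startLogits, endLogits, maxAnswerLen = 15) {
  let best = { start: 0, end: 0, score: -Infinity };
  for (let s = 0; s < startLogits.length; ++s) {
    const lastEnd = Math.min(endLogits.length, s + maxAnswerLen);
    for (let e = s; e < lastEnd; ++e) {
      const score = startLogits[s] + endLogits[e];
      if (score > best.score) best = { start: s, end: e, score };
    }
  }
  return best;
}

// Toy logits for a 5-token input.
const startLogits = [0.1, 0.2, 4.0, 0.3, 0.1];
const endLogits = [0.0, 3.5, 0.2, 5.0, 0.1];
const span = bestSpan(startLogits, endLogits);
console.log(span.start, span.end); // 2 3
```

The token ids between `start` and `end` (inclusive) can then be decoded with the tokenizer to recover the answer text.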

models.CamembertModel

The bare CamemBERT Model transformer outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.CamembertForMaskedLM

CamemBERT Model with a language modeling head on top.

Kind: static class of models

camembertForMaskedLM._call(model_inputs) ⇒ Promise.<MaskedLMOutput>

Calls the model on new inputs.

Kind: instance method of CamembertForMaskedLM
Returns: Promise.<MaskedLMOutput> - An object containing the model’s output logits for masked language modeling.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.CamembertForSequenceClassification

CamemBERT Model transformer with a sequence classification/regression head on top (a linear layer on top of the pooled output) e.g. for GLUE tasks.

Kind: static class of models

camembertForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of CamembertForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.CamembertForTokenClassification

CamemBERT Model with a token classification head on top (a linear layer on top of the hidden-states output) e.g. for Named-Entity-Recognition (NER) tasks.

Kind: static class of models

camembertForTokenClassification._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>

Calls the model on new inputs.

Kind: instance method of CamembertForTokenClassification
Returns: Promise.<TokenClassifierOutput> - An object containing the model’s output logits for token classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.CamembertForQuestionAnswering

CamemBERT Model with a span classification head on top for extractive question-answering tasks.

Kind: static class of models

camembertForQuestionAnswering._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>

Calls the model on new inputs.

Kind: instance method of CamembertForQuestionAnswering
Returns: Promise.<QuestionAnsweringModelOutput> - An object containing the model’s output logits for question answering.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.DebertaModel

The bare DeBERTa Model transformer outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.DebertaForMaskedLM

DeBERTa Model with a language modeling head on top.

Kind: static class of models

debertaForMaskedLM._call(model_inputs) ⇒ Promise.<MaskedLMOutput>

Calls the model on new inputs.

Kind: instance method of DebertaForMaskedLM
Returns: Promise.<MaskedLMOutput> - An object containing the model’s output logits for masked language modeling.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.DebertaForSequenceClassification

DeBERTa Model transformer with a sequence classification/regression head on top (a linear layer on top of the pooled output)

Kind: static class of models

debertaForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of DebertaForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.DebertaForTokenClassification

DeBERTa Model with a token classification head on top (a linear layer on top of the hidden-states output) e.g. for Named-Entity-Recognition (NER) tasks.

Kind: static class of models

debertaForTokenClassification._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>

Calls the model on new inputs.

Kind: instance method of DebertaForTokenClassification
Returns: Promise.<TokenClassifierOutput> - An object containing the model’s output logits for token classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.DebertaForQuestionAnswering

DeBERTa Model with a span classification head on top for extractive question-answering tasks like SQuAD (linear layers on top of the hidden-states output to compute span start logits and span end logits).

Kind: static class of models

debertaForQuestionAnswering._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>

Calls the model on new inputs.

Kind: instance method of DebertaForQuestionAnswering
Returns: Promise.<QuestionAnsweringModelOutput> - An object containing the model’s output logits for question answering.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.DebertaV2Model

The bare DeBERTa-V2 Model transformer outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.DebertaV2ForMaskedLM

DeBERTa-V2 Model with a language modeling head on top.

Kind: static class of models

debertaV2ForMaskedLM._call(model_inputs) ⇒ Promise.<MaskedLMOutput>

Calls the model on new inputs.

Kind: instance method of DebertaV2ForMaskedLM
Returns: Promise.<MaskedLMOutput> - An object containing the model’s output logits for masked language modeling.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.DebertaV2ForSequenceClassification

DeBERTa-V2 Model transformer with a sequence classification/regression head on top (a linear layer on top of the pooled output)

Kind: static class of models

debertaV2ForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of DebertaV2ForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.DebertaV2ForTokenClassification

DeBERTa-V2 Model with a token classification head on top (a linear layer on top of the hidden-states output) e.g. for Named-Entity-Recognition (NER) tasks.

Kind: static class of models

debertaV2ForTokenClassification._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>

Calls the model on new inputs.

Kind: instance method of DebertaV2ForTokenClassification
Returns: Promise.<TokenClassifierOutput> - An object containing the model’s output logits for token classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.DebertaV2ForQuestionAnswering

DeBERTa-V2 Model with a span classification head on top for extractive question-answering tasks like SQuAD (linear layers on top of the hidden-states output to compute span start logits and span end logits).

Kind: static class of models

debertaV2ForQuestionAnswering._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>

Calls the model on new inputs.

Kind: instance method of DebertaV2ForQuestionAnswering
Returns: Promise.<QuestionAnsweringModelOutput> - An object containing the model’s output logits for question answering.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.DistilBertForSequenceClassification

DistilBertForSequenceClassification is a class representing a DistilBERT model for sequence classification.

Kind: static class of models

distilBertForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of DistilBertForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.DistilBertForTokenClassification

DistilBertForTokenClassification is a class representing a DistilBERT model for token classification.

Kind: static class of models

distilBertForTokenClassification._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>

Calls the model on new inputs.

Kind: instance method of DistilBertForTokenClassification
Returns: Promise.<TokenClassifierOutput> - An object containing the model’s output logits for token classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.DistilBertForQuestionAnswering

DistilBertForQuestionAnswering is a class representing a DistilBERT model for question answering.

Kind: static class of models

distilBertForQuestionAnswering._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>

Calls the model on new inputs.

Kind: instance method of DistilBertForQuestionAnswering
Returns: Promise.<QuestionAnsweringModelOutput> - An object containing the model’s output logits for question answering.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.DistilBertForMaskedLM

DistilBertForMaskedLM is a class representing a DistilBERT model for masked language modeling.

Kind: static class of models

distilBertForMaskedLM._call(model_inputs) ⇒ Promise.<MaskedLMOutput>

Calls the model on new inputs.

Kind: instance method of DistilBertForMaskedLM
Returns: Promise.<MaskedLMOutput> - An object containing the model’s output logits for masked language modeling.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.EsmModel

The bare ESM Model transformer outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.EsmForMaskedLM

ESM Model with a language modeling head on top.

Kind: static class of models

esmForMaskedLM._call(model_inputs) ⇒ Promise.<MaskedLMOutput>

Calls the model on new inputs.

Kind: instance method of EsmForMaskedLM
Returns: Promise.<MaskedLMOutput> - An object containing the model’s output logits for masked language modeling.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.EsmForSequenceClassification

ESM Model transformer with a sequence classification/regression head on top (a linear layer on top of the pooled output)

Kind: static class of models

esmForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of EsmForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.EsmForTokenClassification

ESM Model with a token classification head on top (a linear layer on top of the hidden-states output) e.g. for Named-Entity-Recognition (NER) tasks.

Kind: static class of models

esmForTokenClassification._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>

Calls the model on new inputs.

Kind: instance method of EsmForTokenClassification
Returns: Promise.<TokenClassifierOutput> - An object containing the model’s output logits for token classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.MobileBertForMaskedLM

MobileBertForMaskedLM is a class representing a MobileBERT model for masked language modeling.

Kind: static class of models

mobileBertForMaskedLM._call(model_inputs) ⇒ Promise.<MaskedLMOutput>

Calls the model on new inputs.

Kind: instance method of MobileBertForMaskedLM
Returns: Promise.<MaskedLMOutput> - An object containing the model’s output logits for masked language modeling.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.MobileBertForSequenceClassification

MobileBert Model transformer with a sequence classification/regression head on top (a linear layer on top of the pooled output)

Kind: static class of models

mobileBertForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of MobileBertForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.MobileBertForQuestionAnswering

MobileBert Model with a span classification head on top for extractive question-answering tasks

Kind: static class of models

mobileBertForQuestionAnswering._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>

Calls the model on new inputs.

Kind: instance method of MobileBertForQuestionAnswering
Returns: Promise.<QuestionAnsweringModelOutput> - An object containing the model’s output logits for question answering.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.MPNetModel

The bare MPNet Model transformer outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.MPNetForMaskedLM

MPNetForMaskedLM is a class representing an MPNet model for masked language modeling.

Kind: static class of models

mpNetForMaskedLM._call(model_inputs) ⇒ Promise.<MaskedLMOutput>

Calls the model on new inputs.

Kind: instance method of MPNetForMaskedLM
Returns: Promise.<MaskedLMOutput> - An object containing the model’s output logits for masked language modeling.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.MPNetForSequenceClassification

MPNetForSequenceClassification is a class representing an MPNet model for sequence classification.

Kind: static class of models

mpNetForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of MPNetForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.MPNetForTokenClassification

MPNetForTokenClassification is a class representing an MPNet model for token classification.

Kind: static class of models

mpNetForTokenClassification._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>

Calls the model on new inputs.

Kind: instance method of MPNetForTokenClassification
Returns: Promise.<TokenClassifierOutput> - An object containing the model’s output logits for token classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.MPNetForQuestionAnswering

MPNetForQuestionAnswering is a class representing an MPNet model for question answering.

Kind: static class of models

mpNetForQuestionAnswering._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>

Calls the model on new inputs.

Kind: instance method of MPNetForQuestionAnswering
Returns: Promise.<QuestionAnsweringModelOutput> - An object containing the model’s output logits for question answering.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.T5ForConditionalGeneration

T5ForConditionalGeneration is a class representing a T5 model for conditional generation.

Kind: static class of models

new T5ForConditionalGeneration(config, session, decoder_merged_session, generation_config)

Creates a new instance of the T5ForConditionalGeneration class.

Param	Type	Description
config	`Object`	The model configuration.
session	`any`	The session for the model.
decoder_merged_session	`any`	The session for the decoder.
generation_config	`GenerationConfig`	The generation configuration.

models.LongT5PreTrainedModel

An abstract class to handle weights initialization and a simple interface for downloading and loading pretrained models.

Kind: static class of models

models.LongT5Model

The bare LONGT5 Model transformer outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.LongT5ForConditionalGeneration

LONGT5 Model with a language modeling head on top.

Kind: static class of models

new LongT5ForConditionalGeneration(config, session, decoder_merged_session, generation_config)

Creates a new instance of the LongT5ForConditionalGeneration class.

Param	Type	Description
config	`Object`	The model configuration.
session	`any`	The session for the model.
decoder_merged_session	`any`	The session for the decoder.
generation_config	`GenerationConfig`	The generation configuration.

models.MT5ForConditionalGeneration

A class representing a conditional sequence-to-sequence model based on the MT5 architecture.

Kind: static class of models

new MT5ForConditionalGeneration(config, session, decoder_merged_session, generation_config)

Creates a new instance of the MT5ForConditionalGeneration class.

Param	Type	Description
config	`any`	The model configuration.
session	`any`	The ONNX session containing the encoder weights.
decoder_merged_session	`any`	The ONNX session containing the merged decoder weights.
generation_config	`GenerationConfig`	The generation configuration.

models.BartModel

The bare BART Model outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.BartForConditionalGeneration

The BART Model with a language modeling head. Can be used for summarization.

Kind: static class of models

new BartForConditionalGeneration(config, session, decoder_merged_session, generation_config)

Creates a new instance of the BartForConditionalGeneration class.

Param	Type	Description
config	`Object`	The configuration object for the Bart model.
session	`Object`	The ONNX session used to execute the model.
decoder_merged_session	`Object`	The ONNX session used to execute the decoder.
generation_config	`Object`	The generation configuration object.

models.BartForSequenceClassification

BART Model with a sequence classification head on top (a linear layer on top of the pooled output).

Kind: static class of models

bartForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of BartForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.MBartModel

The bare MBART Model outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.MBartForConditionalGeneration

The MBART Model with a language modeling head. Can be used for summarization, after fine-tuning the pretrained models.

Kind: static class of models

new MBartForConditionalGeneration(config, session, decoder_merged_session, generation_config)

Creates a new instance of the MBartForConditionalGeneration class.

Param	Type	Description
config	`Object`	The configuration object for the MBart model.
session	`Object`	The ONNX session used to execute the model.
decoder_merged_session	`Object`	The ONNX session used to execute the decoder.
generation_config	`Object`	The generation configuration object.

models.MBartForSequenceClassification

MBART Model with a sequence classification head on top (a linear layer on top of the pooled output).

Kind: static class of models

mBartForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of MBartForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.MBartForCausalLM

Kind: static class of models

new MBartForCausalLM(config, decoder_merged_session, generation_config)

Creates a new instance of the MBartForCausalLM class.

Param	Type	Description
config	`Object`	Configuration object for the model.
decoder_merged_session	`Object`	ONNX Session object for the decoder.
generation_config	`Object`	Configuration object for the generation process.

models.BlenderbotModel

The bare Blenderbot Model outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.BlenderbotForConditionalGeneration

The Blenderbot Model with a language modeling head. Can be used for summarization.

Kind: static class of models

new BlenderbotForConditionalGeneration(config, session, decoder_merged_session, generation_config)

Creates a new instance of the BlenderbotForConditionalGeneration class.

Param	Type	Description
config	`any`	The model configuration.
session	`any`	The ONNX session containing the encoder weights.
decoder_merged_session	`any`	The ONNX session containing the merged decoder weights.
generation_config	`GenerationConfig`	The generation configuration.

models.BlenderbotSmallModel

The bare BlenderbotSmall Model outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.BlenderbotSmallForConditionalGeneration

The BlenderbotSmall Model with a language modeling head. Can be used for summarization.

Kind: static class of models

new BlenderbotSmallForConditionalGeneration(config, session, decoder_merged_session, generation_config)

Creates a new instance of the BlenderbotSmallForConditionalGeneration class.

Param	Type	Description
config	`any`	The model configuration.
session	`any`	The ONNX session containing the encoder weights.
decoder_merged_session	`any`	The ONNX session containing the merged decoder weights.
generation_config	`GenerationConfig`	The generation configuration.

models.RobertaForMaskedLM

RobertaForMaskedLM class for performing masked language modeling on Roberta models.

Kind: static class of models

robertaForMaskedLM._call(model_inputs) ⇒ Promise.<MaskedLMOutput>

Calls the model on new inputs.

Kind: instance method of RobertaForMaskedLM
Returns: Promise.<MaskedLMOutput> - An object containing the model’s output logits for masked language modeling.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.RobertaForSequenceClassification

RobertaForSequenceClassification class for performing sequence classification on Roberta models.

Kind: static class of models

robertaForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of RobertaForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.RobertaForTokenClassification

RobertaForTokenClassification class for performing token classification on Roberta models.

Kind: static class of models

robertaForTokenClassification._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>

Calls the model on new inputs.

Kind: instance method of RobertaForTokenClassification
Returns: Promise.<TokenClassifierOutput> - An object containing the model’s output logits for token classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.RobertaForQuestionAnswering

RobertaForQuestionAnswering class for performing question answering on Roberta models.

Kind: static class of models

robertaForQuestionAnswering._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>

Calls the model on new inputs.

Kind: instance method of RobertaForQuestionAnswering
Returns: Promise.<QuestionAnsweringModelOutput> - An object containing the model’s output logits for question answering.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.XLMPreTrainedModel

An abstract class to handle weights initialization and a simple interface for downloading and loading pretrained models.

Kind: static class of models

models.XLMModel

The bare XLM Model transformer outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.XLMWithLMHeadModel

The XLM Model transformer with a language modeling head on top (linear layer with weights tied to the input embeddings).

Kind: static class of models

xlmWithLMHeadModel._call(model_inputs) ⇒ Promise.<MaskedLMOutput>

Calls the model on new inputs.

Kind: instance method of XLMWithLMHeadModel
Returns: Promise.<MaskedLMOutput> - An object containing the model’s output logits for masked language modeling.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.XLMForSequenceClassification

XLM Model with a sequence classification/regression head on top (a linear layer on top of the pooled output)

Kind: static class of models

xlmForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of XLMForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.XLMForTokenClassification

XLM Model with a token classification head on top (a linear layer on top of the hidden-states output)

Kind: static class of models

xlmForTokenClassification._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>

Calls the model on new inputs.

Kind: instance method of XLMForTokenClassification
Returns: Promise.<TokenClassifierOutput> - An object containing the model’s output logits for token classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.XLMForQuestionAnswering

XLM Model with a span classification head on top for extractive question-answering tasks

Kind: static class of models

xlmForQuestionAnswering._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>

Calls the model on new inputs.

Kind: instance method of XLMForQuestionAnswering
Returns: Promise.<QuestionAnsweringModelOutput> - An object containing the model’s output logits for question answering.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.XLMRobertaForMaskedLM

XLMRobertaForMaskedLM class for performing masked language modeling on XLMRoberta models.

Kind: static class of models

xlmRobertaForMaskedLM._call(model_inputs) ⇒ Promise.<MaskedLMOutput>

Calls the model on new inputs.

Kind: instance method of XLMRobertaForMaskedLM
Returns: Promise.<MaskedLMOutput> - An object containing the model’s output logits for masked language modeling.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.XLMRobertaForSequenceClassification

XLMRobertaForSequenceClassification class for performing sequence classification on XLMRoberta models.

Kind: static class of models

xlmRobertaForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of XLMRobertaForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.XLMRobertaForTokenClassification

XLMRobertaForTokenClassification class for performing token classification on XLMRoberta models.

Kind: static class of models

xlmRobertaForTokenClassification._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>

Calls the model on new inputs.

Kind: instance method of XLMRobertaForTokenClassification
Returns: Promise.<TokenClassifierOutput> - An object containing the model’s output logits for token classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.XLMRobertaForQuestionAnswering

XLMRobertaForQuestionAnswering class for performing question answering on XLMRoberta models.

Kind: static class of models

xlmRobertaForQuestionAnswering._call(model_inputs) ⇒ Promise.<QuestionAnsweringModelOutput>

Calls the model on new inputs.

Kind: instance method of XLMRobertaForQuestionAnswering
Returns: Promise.<QuestionAnsweringModelOutput> - An object containing the model’s output logits for question answering.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.ASTModel

The bare AST Model transformer outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.ASTForAudioClassification

Audio Spectrogram Transformer model with an audio classification head on top (a linear layer on top of the pooled output) e.g. for datasets like AudioSet, Speech Commands v2.

Kind: static class of models

models.WhisperModel

WhisperModel class for training Whisper models without a language model head.

Kind: static class of models

models.WhisperForConditionalGeneration

WhisperForConditionalGeneration class for generating conditional outputs from Whisper models.

Kind: static class of models

new WhisperForConditionalGeneration(config, session, decoder_merged_session, generation_config)

Creates a new instance of the WhisperForConditionalGeneration class.

Param	Type	Description
config	`Object`	Configuration object for the model.
session	`Object`	ONNX Session object for the model.
decoder_merged_session	`Object`	ONNX Session object for the decoder.
generation_config	`Object`	Configuration object for the generation process.

whisperForConditionalGeneration.generate(inputs, generation_config, logits_processor) ⇒ Promise.<Object>

Generates outputs based on input and generation configuration.

Kind: instance method of WhisperForConditionalGeneration
Returns: Promise.<Object> - A promise that resolves to the generated outputs.

Param	Type	Description
inputs	`Object`	Input data for the model.
generation_config	`WhisperGenerationConfig`	Configuration object for the generation process.
logits_processor	`Object`	Optional logits processor object.

whisperForConditionalGeneration._extract_token_timestamps(generate_outputs, alignment_heads, [num_frames], [time_precision]) ⇒ Tensor

Calculates token-level timestamps using the encoder-decoder cross-attentions and dynamic time-warping (DTW) to map each output token to a position in the input audio.

Kind: instance method of WhisperForConditionalGeneration
Returns: Tensor - tensor containing the timestamps in seconds for each predicted token

Param	Type	Default	Description
generate_outputs	`Object`		Outputs generated by the model
generate_outputs.cross_attentions	`Array.<Array<Array<Tensor>>>`		The cross attentions output by the model
generate_outputs.decoder_attentions	`Array.<Array<Array<Tensor>>>`		The decoder attentions output by the model
generate_outputs.sequences	`Array.<Array<number>>`		The sequences output by the model
alignment_heads	`Array.<Array<number>>`		Alignment heads of the model
[num_frames]	`number`		Number of frames in the input audio.
[time_precision]	`number`	`0.02`	Precision of the timestamps in seconds
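
The dynamic time-warping step described above can be illustrated on a small cost matrix. The sketch below is a textbook DTW with backtracking, not the library's implementation: it assumes a `[numTokens][numFrames]` cost matrix (for example, negated attention weights) and converts the first aligned frame of each token to seconds using the default `time_precision` of `0.02`. `dtwPath` and `tokenStartTimes` are hypothetical helper names.

```javascript
// Classic dynamic time warping over a cost matrix of shape [numTokens][numFrames].
// Returns the minimum-cost monotonic path as [tokenIndex, frameIndex] pairs.
function dtwPath(cost) {
  const n = cost.length;
  const m = cost[0].length;
  // acc[i][j] = minimal accumulated cost of aligning the first i tokens
  // with the first j frames.
  const acc = Array.from({ length: n + 1 }, () => new Array(m + 1).fill(Infinity));
  acc[0][0] = 0;
  for (let i = 1; i <= n; ++i) {
    for (let j = 1; j <= m; ++j) {
      const prev = Math.min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1]);
      acc[i][j] = cost[i - 1][j - 1] + prev;
    }
  }
  // Backtrack from the last cell to the first.
  const path = [];
  let i = n;
  let j = m;
  while (i > 0 && j > 0) {
    path.push([i - 1, j - 1]);
    const diag = acc[i - 1][j - 1];
    const up = acc[i - 1][j];
    const left = acc[i][j - 1];
    if (diag <= up && diag <= left) {
      --i;
      --j;
    } else if (up <= left) {
      --i;
    } else {
      --j;
    }
  }
  return path.reverse();
}

// Start time of each token: first frame on the path for that token, in seconds.
function tokenStartTimes(cost, timePrecision = 0.02) {
  const starts = new Array(cost.length).fill(null);
  for (const [token, frame] of dtwPath(cost)) {
    if (starts[token] === null) starts[token] = frame * timePrecision;
  }
  return starts;
}

// Toy cost matrix: 3 tokens, 6 frames; low cost where a token aligns with a frame.
const cost = [
  [0, 0, 1, 1, 1, 1],
  [1, 1, 0, 0, 0, 1],
  [1, 1, 1, 1, 1, 0],
];
const times = tokenStartTimes(cost);
console.log(times); // roughly [ 0, 0.04, 0.1 ]
```

In this toy example the three tokens begin at frames 0, 2, and 5, which correspond to about 0 s, 0.04 s, and 0.1 s.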

models.VisionEncoderDecoderModel

Vision Encoder-Decoder model based on OpenAI’s GPT architecture for image captioning and other vision tasks

Kind: static class of models

new VisionEncoderDecoderModel(config, session, decoder_merged_session, generation_config)

Creates a new instance of the VisionEncoderDecoderModel class.

Param	Type	Description
config	`Object`	The configuration object specifying the hyperparameters and other model settings.
session	`Object`	The ONNX session containing the encoder model.
decoder_merged_session	`any`	The ONNX session containing the merged decoder model.
generation_config	`Object`	Configuration object for the generation process.

models.CLIPModel

CLIP Text and Vision Model with projection layers on top.

Example: Perform zero-shot image classification with a CLIPModel.

import { AutoTokenizer, AutoProcessor, CLIPModel, RawImage } from '@xenova/transformers';

// Load tokenizer, processor, and model
let tokenizer = await AutoTokenizer.from_pretrained('Xenova/clip-vit-base-patch16');
let processor = await AutoProcessor.from_pretrained('Xenova/clip-vit-base-patch16');
let model = await CLIPModel.from_pretrained('Xenova/clip-vit-base-patch16');

// Run tokenization
let texts = ['a photo of a car', 'a photo of a football match'];
let text_inputs = tokenizer(texts, { padding: true, truncation: true });

// Read image and run processor
let image = await RawImage.read('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/football-match.jpg');
let image_inputs = await processor(image);

// Run model with both text and pixel inputs
let output = await model({ ...text_inputs, ...image_inputs });
// {
//   logits_per_image: Tensor {
//     dims: [ 1, 2 ],
//     data: Float32Array(2) [ 18.579734802246094, 24.31830596923828 ],
//   },
//   logits_per_text: Tensor {
//     dims: [ 2, 1 ],
//     data: Float32Array(2) [ 18.579734802246094, 24.31830596923828 ],
//   },
//   text_embeds: Tensor {
//     dims: [ 2, 512 ],
//     data: Float32Array(1024) [ ... ],
//   },
//   image_embeds: Tensor {
//     dims: [ 1, 512 ],
//     data: Float32Array(512) [ ... ],
//   }
// }
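
To read the `logits_per_image` values as zero-shot class probabilities, a softmax over the candidate texts is the usual step. The sketch below is plain JavaScript with a hypothetical `softmax` helper written for this example; the logits are copied from the output above.

```javascript
// Hypothetical helper for illustration; numerically stable softmax over a list of logits.
function softmax(values) {
  const max = Math.max(...values);
  const exps = Array.from(values, (v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Logits from the example output: one image scored against two candidate captions
const logitsPerImage = [18.579734802246094, 24.31830596923828];
const probs = softmax(logitsPerImage);
console.log(probs);
```

Here the second caption ('a photo of a football match') receives nearly all of the probability mass.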

Kind: static class of models

models.CLIPTextModelWithProjection

CLIP Text Model with a projection layer on top (a linear layer on top of the pooled output)

Example: Compute text embeddings with CLIPTextModelWithProjection.

import { AutoTokenizer, CLIPTextModelWithProjection } from '@xenova/transformers';

// Load tokenizer and text model
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/clip-vit-base-patch16');
const text_model = await CLIPTextModelWithProjection.from_pretrained('Xenova/clip-vit-base-patch16');

// Run tokenization
let texts = ['a photo of a car', 'a photo of a football match'];
let text_inputs = tokenizer(texts, { padding: true, truncation: true });

// Compute embeddings
const { text_embeds } = await text_model(text_inputs);
// Tensor {
//   dims: [ 2, 512 ],
//   type: 'float32',
//   data: Float32Array(1024) [ ... ],
//   size: 1024
// }
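
A common next step is to compare embeddings with cosine similarity. The sketch below assumes a row-major flat `data` array with `dims` of `[num_texts, embedding_size]`, as in the output above; `getRow`, `cosineSimilarity` and the toy values are hypothetical and only stand in for a real `text_embeds` tensor.

```javascript
// Hypothetical helpers for illustration only.

// Return one embedding (row) from a flat, row-major [rows, width] buffer.
function getRow(data, dims, row) {
  const width = dims[1];
  return data.subarray(row * width, (row + 1) * width);
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; ++i) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy stand-in for text_embeds.data and text_embeds.dims (2 texts, 3 dimensions)
const data = new Float32Array([1, 0, 0, 0.6, 0.8, 0]);
const dims = [2, 3];
const similarity = cosineSimilarity(getRow(data, dims, 0), getRow(data, dims, 1));
console.log(similarity);
```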

Kind: static class of models

CLIPTextModelWithProjection.from_pretrained() : PreTrainedModel.from_pretrained

Kind: static method of CLIPTextModelWithProjection

models.CLIPVisionModelWithProjection

CLIP Vision Model with a projection layer on top (a linear layer on top of the pooled output)

Example: Compute vision embeddings with CLIPVisionModelWithProjection.

import { AutoProcessor, CLIPVisionModelWithProjection, RawImage } from '@xenova/transformers';

// Load processor and vision model
const processor = await AutoProcessor.from_pretrained('Xenova/clip-vit-base-patch16');
const vision_model = await CLIPVisionModelWithProjection.from_pretrained('Xenova/clip-vit-base-patch16');

// Read image and run processor
let image = await RawImage.read('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/football-match.jpg');
let image_inputs = await processor(image);

// Compute embeddings
const { image_embeds } = await vision_model(image_inputs);
// Tensor {
//   dims: [ 1, 512 ],
//   type: 'float32',
//   data: Float32Array(512) [ ... ],
//   size: 512
// }

Kind: static class of models

CLIPVisionModelWithProjection.from_pretrained() : PreTrainedModel.from_pretrained

Kind: static method of CLIPVisionModelWithProjection

models.SiglipModel

SigLIP Text and Vision Model with projection layers on top

Example: Perform zero-shot image classification with a SiglipModel.

import { AutoTokenizer, AutoProcessor, SiglipModel, RawImage } from '@xenova/transformers';

// Load tokenizer, processor, and model
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/siglip-base-patch16-224');
const processor = await AutoProcessor.from_pretrained('Xenova/siglip-base-patch16-224');
const model = await SiglipModel.from_pretrained('Xenova/siglip-base-patch16-224');

// Run tokenization
const texts = ['a photo of 2 cats', 'a photo of 2 dogs'];
const text_inputs = tokenizer(texts, { padding: 'max_length', truncation: true });

// Read image and run processor
const image = await RawImage.read('http://images.cocodataset.org/val2017/000000039769.jpg');
const image_inputs = await processor(image);

// Run model with both text and pixel inputs
const output = await model({ ...text_inputs, ...image_inputs });
// {
//   logits_per_image: Tensor {
//     dims: [ 1, 2 ],
//     data: Float32Array(2) [ -1.6019744873046875, -10.720091819763184 ],
//   },
//   logits_per_text: Tensor {
//     dims: [ 2, 1 ],
//     data: Float32Array(2) [ -1.6019744873046875, -10.720091819763184 ],
//   },
//   text_embeds: Tensor {
//     dims: [ 2, 768 ],
//     data: Float32Array(1536) [ ... ],
//   },
//   image_embeds: Tensor {
//     dims: [ 1, 768 ],
//     data: Float32Array(768) [ ... ],
//   }
// }
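
SigLIP is trained with a pairwise sigmoid loss, so each image-text logit is typically turned into an independent probability with a sigmoid rather than normalized across texts with a softmax. A minimal sketch in plain JavaScript, using the logits from the output above (the `sigmoid` helper is written for this example):

```javascript
// Hypothetical helper for illustration.
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// Logits from the example output: one image scored against two candidate captions
const logitsPerImage = [-1.6019744873046875, -10.720091819763184];
const probs = logitsPerImage.map(sigmoid);
console.log(probs);
```

Because each probability is computed independently, the values need not sum to 1.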

Kind: static class of models

models.SiglipTextModel

The text model from SigLIP without any head or projection on top.

Example: Compute text embeddings with SiglipTextModel.

import { AutoTokenizer, SiglipTextModel } from '@xenova/transformers';

// Load tokenizer and text model
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/siglip-base-patch16-224');
const text_model = await SiglipTextModel.from_pretrained('Xenova/siglip-base-patch16-224');

// Run tokenization
const texts = ['a photo of 2 cats', 'a photo of 2 dogs'];
const text_inputs = tokenizer(texts, { padding: 'max_length', truncation: true });

// Compute embeddings
const { pooler_output } = await text_model(text_inputs);
// Tensor {
//   dims: [ 2, 768 ],
//   type: 'float32',
//   data: Float32Array(1536) [ ... ],
//   size: 1536
// }

Kind: static class of models

SiglipTextModel.from_pretrained() : PreTrainedModel.from_pretrained

Kind: static method of SiglipTextModel

models.SiglipVisionModel

The vision model from SigLIP without any head or projection on top.

Example: Compute vision embeddings with SiglipVisionModel.

import { AutoProcessor, SiglipVisionModel, RawImage } from '@xenova/transformers';

// Load processor and vision model
const processor = await AutoProcessor.from_pretrained('Xenova/siglip-base-patch16-224');
const vision_model = await SiglipVisionModel.from_pretrained('Xenova/siglip-base-patch16-224');

// Read image and run processor
const image = await RawImage.read('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/football-match.jpg');
const image_inputs = await processor(image);

// Compute embeddings
const { pooler_output } = await vision_model(image_inputs);
// Tensor {
//   dims: [ 1, 768 ],
//   type: 'float32',
//   data: Float32Array(768) [ ... ],
//   size: 768
// }

Kind: static class of models

SiglipVisionModel.from_pretrained() : PreTrainedModel.from_pretrained

Kind: static method of SiglipVisionModel

models.CLIPSegForImageSegmentation

CLIPSeg model with a Transformer-based decoder on top for zero-shot and one-shot image segmentation.

Example: Perform zero-shot image segmentation with a CLIPSegForImageSegmentation model.

import { AutoTokenizer, AutoProcessor, CLIPSegForImageSegmentation, RawImage } from '@xenova/transformers';

// Load tokenizer, processor, and model
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/clipseg-rd64-refined');
const processor = await AutoProcessor.from_pretrained('Xenova/clipseg-rd64-refined');
const model = await CLIPSegForImageSegmentation.from_pretrained('Xenova/clipseg-rd64-refined');

// Run tokenization
const texts = ['a glass', 'something to fill', 'wood', 'a jar'];
const text_inputs = tokenizer(texts, { padding: true, truncation: true });

// Read image and run processor
const image = await RawImage.read('https://github.com/timojl/clipseg/blob/master/example_image.jpg?raw=true');
const image_inputs = await processor(image);

// Run model with both text and pixel inputs
const { logits } = await model({ ...text_inputs, ...image_inputs });
// logits: Tensor {
//   dims: [4, 352, 352],
//   type: 'float32',
//   data: Float32Array(495616) [ ... ],
//   size: 495616
// }

You can visualize the predictions as follows:

const preds = logits
  .unsqueeze_(1)
  .sigmoid_()
  .mul_(255)
  .round_()
  .to('uint8');

for (let i = 0; i < preds.dims[0]; ++i) {
  const img = RawImage.fromTensor(preds[i]);
  img.save(`prediction_${i}.png`);
}
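
For reference, the per-element arithmetic behind that chain can be sketched in plain JavaScript. This is only an illustration that assumes rounding to the nearest integer; `logitsToMask` is a hypothetical helper, not a library function.

```javascript
// Hypothetical helper for illustration: map raw logits to 0-255 grayscale values.
function logitsToMask(logits) {
  const mask = new Uint8Array(logits.length);
  for (let i = 0; i < logits.length; ++i) {
    const prob = 1 / (1 + Math.exp(-logits[i])); // sigmoid: logit -> probability
    mask[i] = Math.round(prob * 255);            // scale to 0-255 and round
  }
  return mask;
}

// Strongly negative, neutral, and strongly positive logits
const mask = logitsToMask(new Float32Array([-20, 0, 20]));
console.log(mask);
```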

Kind: static class of models

models.GPT2PreTrainedModel

Kind: static class of models

new GPT2PreTrainedModel(config, session, generation_config)

Creates a new instance of the GPT2PreTrainedModel class.

Param	Type	Description
config	`Object`	The configuration of the model.
session	`any`	The ONNX session containing the model weights.
generation_config	`GenerationConfig`	The generation configuration.

models.GPT2LMHeadModel

GPT-2 language model head on top of the GPT-2 base model. This model is suitable for text generation tasks.

Kind: static class of models

models.GPTNeoPreTrainedModel

Kind: static class of models

new GPTNeoPreTrainedModel(config, session, generation_config)

Creates a new instance of the GPTNeoPreTrainedModel class.

Param	Type	Description
config	`Object`	The configuration of the model.
session	`any`	The ONNX session containing the model weights.
generation_config	`GenerationConfig`	The generation configuration.

models.GPTNeoXPreTrainedModel

Kind: static class of models

new GPTNeoXPreTrainedModel(config, session, generation_config)

Creates a new instance of the GPTNeoXPreTrainedModel class.

Param	Type	Description
config	`Object`	The configuration of the model.
session	`any`	The ONNX session containing the model weights.
generation_config	`GenerationConfig`	The generation configuration.

models.GPTJPreTrainedModel

Kind: static class of models

new GPTJPreTrainedModel(config, session, generation_config)

Creates a new instance of the GPTJPreTrainedModel class.

Param	Type	Description
config	`Object`	The configuration of the model.
session	`any`	The ONNX session containing the model weights.
generation_config	`GenerationConfig`	The generation configuration.

models.GPTBigCodePreTrainedModel

Kind: static class of models

new GPTBigCodePreTrainedModel(config, session, generation_config)

Creates a new instance of the GPTBigCodePreTrainedModel class.

Param	Type	Description
config	`Object`	The configuration of the model.
session	`any`	The ONNX session containing the model weights.
generation_config	`GenerationConfig`	The generation configuration.

models.CodeGenPreTrainedModel

Kind: static class of models

new CodeGenPreTrainedModel(config, session, generation_config)

Creates a new instance of the CodeGenPreTrainedModel class.

Param	Type	Description
config	`Object`	The model configuration object.
session	`Object`	The ONNX session object.
generation_config	`GenerationConfig`	The generation configuration.

models.CodeGenModel

CodeGenModel is a class representing a code generation model without a language model head.

Kind: static class of models

models.CodeGenForCausalLM

CodeGenForCausalLM is a class that represents a code generation model based on the GPT-2 architecture. It extends the CodeGenPreTrainedModel class.

Kind: static class of models

models.LlamaPreTrainedModel

The bare LLaMA Model outputting raw hidden-states without any specific head on top.

Kind: static class of models

new LlamaPreTrainedModel(config, session, generation_config)

Creates a new instance of the LlamaPreTrainedModel class.

Param	Type	Description
config	`Object`	The model configuration object.
session	`Object`	The ONNX session object.
generation_config	`GenerationConfig`	The generation configuration.

models.LlamaModel

The bare LLaMA Model outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.Qwen2PreTrainedModel

The bare Qwen2 Model outputting raw hidden-states without any specific head on top.

Kind: static class of models

new Qwen2PreTrainedModel(config, session, generation_config)

Creates a new instance of the Qwen2PreTrainedModel class.

Param	Type	Description
config	`Object`	The model configuration object.
session	`Object`	The ONNX session object.
generation_config	`GenerationConfig`	The generation configuration.

models.Qwen2Model

The bare Qwen2 Model outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.PhiPreTrainedModel

Kind: static class of models

new PhiPreTrainedModel(config, session, generation_config)

Creates a new instance of the PhiPreTrainedModel class.

Param	Type	Description
config	`Object`	The model configuration object.
session	`Object`	The ONNX session object.
generation_config	`GenerationConfig`	The generation configuration.

models.PhiModel

The bare Phi Model outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.BloomPreTrainedModel

The Bloom Model transformer with a language modeling head on top (linear layer with weights tied to the input embeddings).

Kind: static class of models

new BloomPreTrainedModel(config, session, generation_config)

Creates a new instance of the BloomPreTrainedModel class.

Param	Type	Description
config	`Object`	The configuration of the model.
session	`any`	The ONNX session containing the model weights.
generation_config	`GenerationConfig`	The generation configuration.

models.BloomModel

The bare Bloom Model transformer outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.BloomForCausalLM

The Bloom Model transformer with a language modeling head on top (linear layer with weights tied to the input embeddings).

Kind: static class of models

models.MptPreTrainedModel

Kind: static class of models

new MptPreTrainedModel(config, session, generation_config)

Creates a new instance of the MptPreTrainedModel class.

Param	Type	Description
config	`Object`	The model configuration object.
session	`Object`	The ONNX session object.
generation_config	`GenerationConfig`	The generation configuration.

models.MptModel

The bare MPT Model transformer outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.MptForCausalLM

The MPT Model transformer with a language modeling head on top (linear layer with weights tied to the input embeddings).

Kind: static class of models

models.OPTPreTrainedModel

Kind: static class of models

new OPTPreTrainedModel(config, session, generation_config)

Creates a new instance of the OPTPreTrainedModel class.

Param	Type	Description
config	`Object`	The model configuration object.
session	`Object`	The ONNX session object.
generation_config	`GenerationConfig`	The generation configuration.

models.OPTModel

The bare OPT Model outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.OPTForCausalLM

The OPT Model transformer with a language modeling head on top (linear layer with weights tied to the input embeddings).

Kind: static class of models

models.VitMatteForImageMatting

ViTMatte framework leveraging any vision backbone e.g. for ADE20k, CityScapes.

Example: Perform image matting with a VitMatteForImageMatting model.

import { AutoProcessor, VitMatteForImageMatting, RawImage } from '@xenova/transformers';

// Load processor and model
const processor = await AutoProcessor.from_pretrained('Xenova/vitmatte-small-distinctions-646');
const model = await VitMatteForImageMatting.from_pretrained('Xenova/vitmatte-small-distinctions-646');

// Load image and trimap
const image = await RawImage.fromURL('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/vitmatte_image.png');
const trimap = await RawImage.fromURL('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/vitmatte_trimap.png');

// Prepare image + trimap for the model
const inputs = await processor(image, trimap);

// Predict alpha matte
const { alphas } = await model(inputs);
// Tensor {
//   dims: [ 1, 1, 640, 960 ],
//   type: 'float32',
//   size: 614400,
//   data: Float32Array(614400) [ 0.9894027709960938, 0.9970508813858032, ... ]
// }

You can visualize the alpha matte as follows:

import { Tensor, cat } from '@xenova/transformers';

// Visualize predicted alpha matte
const imageTensor = image.toTensor();

// Convert float (0-1) alpha matte to uint8 (0-255)
const alphaChannel = alphas
  .squeeze(0)
  .mul_(255)
  .clamp_(0, 255)
  .round_()
  .to('uint8');

// Concatenate original image with predicted alpha
const imageData = cat([imageTensor, alphaChannel], 0);

// Save output image
const outputImage = RawImage.fromTensor(imageData);
outputImage.save('output.png');

Kind: static class of models

vitMatteForImageMatting._call(model_inputs)

Kind: instance method of VitMatteForImageMatting

Param	Type
model_inputs	`any`

models.DetrObjectDetectionOutput

Kind: static class of models

new DetrObjectDetectionOutput(output)

Param	Type	Description
output	`Object`	The output of the model.
output.logits	`Tensor`	Classification logits (including no-object) for all queries.
output.pred_boxes	`Tensor`	Normalized box coordinates for all queries, represented as (center_x, center_y, width, height). These values are normalized to [0, 1], relative to the size of each individual image in the batch (disregarding possible padding).
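
To draw such a box, the normalized (center_x, center_y, width, height) values are usually converted to pixel corner coordinates. A minimal sketch, where `toPixelCorners` and the sample box are hypothetical and the image is assumed to be 640x480:

```javascript
// Hypothetical helper for illustration only.
// Convert a normalized [cx, cy, w, h] box to pixel corners for an image of the given size.
function toPixelCorners([cx, cy, w, h], imageWidth, imageHeight) {
  return {
    xmin: (cx - w / 2) * imageWidth,
    ymin: (cy - h / 2) * imageHeight,
    xmax: (cx + w / 2) * imageWidth,
    ymax: (cy + h / 2) * imageHeight,
  };
}

// A box centred in the image, covering 20% of its width and 40% of its height
const box = toPixelCorners([0.5, 0.5, 0.2, 0.4], 640, 480);
console.log(box);
```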

models.DetrSegmentationOutput

Kind: static class of models

new DetrSegmentationOutput(output)

Param	Type	Description
output	`Object`	The output of the model.
output.logits	`Tensor`	The output logits of the model.
output.pred_boxes	`Tensor`	Predicted boxes.
output.pred_masks	`Tensor`	Predicted masks.

models.TableTransformerModel

The bare Table Transformer Model (consisting of a backbone and encoder-decoder Transformer) outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.TableTransformerForObjectDetection

Table Transformer Model (consisting of a backbone and encoder-decoder Transformer) with object detection heads on top, for tasks such as COCO detection.

Kind: static class of models

tableTransformerForObjectDetection._call(model_inputs)

Kind: instance method of TableTransformerForObjectDetection

Param	Type
model_inputs	`any`

models.ResNetPreTrainedModel

An abstract class to handle weights initialization and a simple interface for downloading and loading pretrained models.

Kind: static class of models

models.ResNetModel

The bare ResNet model outputting raw features without any specific head on top.

Kind: static class of models

models.ResNetForImageClassification

ResNet Model with an image classification head on top (a linear layer on top of the pooled features), e.g. for ImageNet.

Kind: static class of models

resNetForImageClassification._call(model_inputs)

Kind: instance method of ResNetForImageClassification

Param	Type
model_inputs	`any`

models.Swin2SRModel

The bare Swin2SR Model transformer outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.Swin2SRForImageSuperResolution

Swin2SR Model transformer with an upsampler head on top for image super resolution and restoration.

Example: Super-resolution w/ Xenova/swin2SR-classical-sr-x2-64.

import { AutoProcessor, Swin2SRForImageSuperResolution, RawImage } from '@xenova/transformers';

// Load processor and model
const model_id = 'Xenova/swin2SR-classical-sr-x2-64';
const processor = await AutoProcessor.from_pretrained(model_id);
const model = await Swin2SRForImageSuperResolution.from_pretrained(model_id);

// Prepare model inputs
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/butterfly.jpg';
const image = await RawImage.fromURL(url);
const inputs = await processor(image);

// Run model
const outputs = await model(inputs);

// Convert Tensor to RawImage
const output = outputs.reconstruction.squeeze().clamp_(0, 1).mul_(255).round_().to('uint8');
const outputImage = RawImage.fromTensor(output);
// RawImage {
//   data: Uint8Array(786432) [ 41, 31, 24, ... ],
//   width: 512,
//   height: 512,
//   channels: 3
// }

Kind: static class of models

models.DPTModel

The bare DPT Model transformer outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.DPTForDepthEstimation

DPT Model with a depth estimation head on top (consisting of 3 convolutional layers) e.g. for KITTI, NYUv2.

Example: Depth estimation w/ Xenova/dpt-hybrid-midas.

import { DPTForDepthEstimation, AutoProcessor, RawImage, interpolate, max } from '@xenova/transformers';

// Load model and processor
const model_id = 'Xenova/dpt-hybrid-midas';
const model = await DPTForDepthEstimation.from_pretrained(model_id);
const processor = await AutoProcessor.from_pretrained(model_id);

// Load image from URL
const url = 'http://images.cocodataset.org/val2017/000000039769.jpg';
const image = await RawImage.fromURL(url);

// Prepare image for the model
const inputs = await processor(image);

// Run model
const { predicted_depth } = await model(inputs);

// Interpolate to original size
const prediction = interpolate(predicted_depth, image.size.reverse(), 'bilinear', false);

// Visualize the prediction
const formatted = prediction.mul_(255 / max(prediction.data)[0]).to('uint8');
const depth = RawImage.fromTensor(formatted);
// RawImage {
//   data: Uint8Array(307200) [ 85, 85, 84, ... ],
//   width: 640,
//   height: 480,
//   channels: 1
// }

Kind: static class of models

models.DepthAnythingForDepthEstimation

Depth Anything Model with a depth estimation head on top (consisting of 3 convolutional layers) e.g. for KITTI, NYUv2.

Kind: static class of models

models.GLPNModel

The bare GLPN encoder (Mix-Transformer) outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.GLPNForDepthEstimation

GLPN Model transformer with a lightweight depth estimation head on top e.g. for KITTI, NYUv2.

Example: Depth estimation w/ Xenova/glpn-kitti.

import { GLPNForDepthEstimation, AutoProcessor, RawImage, interpolate, max } from '@xenova/transformers';

// Load model and processor
const model_id = 'Xenova/glpn-kitti';
const model = await GLPNForDepthEstimation.from_pretrained(model_id);
const processor = await AutoProcessor.from_pretrained(model_id);

// Load image from URL
const url = 'http://images.cocodataset.org/val2017/000000039769.jpg';
const image = await RawImage.fromURL(url);

// Prepare image for the model
const inputs = await processor(image);

// Run model
const { predicted_depth } = await model(inputs);

// Interpolate to original size
const prediction = interpolate(predicted_depth, image.size.reverse(), 'bilinear', false);

// Visualize the prediction
const formatted = prediction.mul_(255 / max(prediction.data)[0]).to('uint8');
const depth = RawImage.fromTensor(formatted);
// RawImage {
//   data: Uint8Array(307200) [ 207, 169, 154, ... ],
//   width: 640,
//   height: 480,
//   channels: 1
// }

Kind: static class of models

models.DonutSwinModel

The bare Donut Swin Model transformer outputting raw hidden-states without any specific head on top.

Example: Step-by-step Document Parsing.

import { AutoProcessor, AutoTokenizer, AutoModelForVision2Seq, RawImage } from '@xenova/transformers';

// Choose model to use
const model_id = 'Xenova/donut-base-finetuned-cord-v2';

// Prepare image inputs
const processor = await AutoProcessor.from_pretrained(model_id);
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/receipt.png';
const image = await RawImage.read(url);
const image_inputs = await processor(image);

// Prepare decoder inputs
const tokenizer = await AutoTokenizer.from_pretrained(model_id);
const task_prompt = '<s_cord-v2>';
const decoder_input_ids = tokenizer(task_prompt, {
  add_special_tokens: false,
}).input_ids;

// Create the model
const model = await AutoModelForVision2Seq.from_pretrained(model_id);

// Run inference
const output = await model.generate(image_inputs.pixel_values, {
  decoder_input_ids,
  max_length: model.config.decoder.max_position_embeddings,
});

// Decode output
const decoded = tokenizer.batch_decode(output)[0];
// <s_cord-v2><s_menu><s_nm> CINNAMON SUGAR</s_nm><s_unitprice> 17,000</s_unitprice><s_cnt> 1 x</s_cnt><s_price> 17,000</s_price></s_menu><s_sub_total><s_subtotal_price> 17,000</s_subtotal_price></s_sub_total><s_total><s_total_price> 17,000</s_total_price><s_cashprice> 20,000</s_cashprice><s_changeprice> 3,000</s_changeprice></s_total></s>
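
The decoded string uses paired `<s_key>` and `</s_key>` tags, so it can be turned into a nested object. The sketch below is a deliberately simplified parser written for this example (`donutToObject` is hypothetical); it ignores unclosed tags such as the task prompt and does not handle repeated keys or list separators.

```javascript
// Hypothetical helper for illustration: parse <s_key>...</s_key> pairs into a nested object.
function donutToObject(text) {
  const result = {};
  // Match an opening tag, its (lazy) content, and the closing tag with the same key
  const tagPattern = /<s_([^>]+)>([\s\S]*?)<\/s_\1>/g;
  let found = false;
  let match;
  while ((match = tagPattern.exec(text)) !== null) {
    found = true;
    const key = match[1];
    const inner = match[2];
    // Recurse when the content itself contains tags, otherwise keep the trimmed text
    result[key] = /<s_[^>]+>/.test(inner) ? donutToObject(inner) : inner.trim();
  }
  return found ? result : text.trim();
}

// Output string copied from the example above
const decodedText = '<s_cord-v2><s_menu><s_nm> CINNAMON SUGAR</s_nm><s_unitprice> 17,000</s_unitprice><s_cnt> 1 x</s_cnt><s_price> 17,000</s_price></s_menu><s_sub_total><s_subtotal_price> 17,000</s_subtotal_price></s_sub_total><s_total><s_total_price> 17,000</s_total_price><s_cashprice> 20,000</s_cashprice><s_changeprice> 3,000</s_changeprice></s_total></s>';
const parsed = donutToObject(decodedText);
console.log(JSON.stringify(parsed, null, 2));
```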

Example: Step-by-step Document Visual Question Answering (DocVQA)

import { AutoProcessor, AutoTokenizer, AutoModelForVision2Seq, RawImage } from '@xenova/transformers';

// Choose model to use
const model_id = 'Xenova/donut-base-finetuned-docvqa';

// Prepare image inputs
const processor = await AutoProcessor.from_pretrained(model_id);
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/invoice.png';
const image = await RawImage.read(url);
const image_inputs = await processor(image);

// Prepare decoder inputs
const tokenizer = await AutoTokenizer.from_pretrained(model_id);
const question = 'What is the invoice number?';
const task_prompt = `<s_docvqa><s_question>${question}</s_question><s_answer>`;
const decoder_input_ids = tokenizer(task_prompt, {
  add_special_tokens: false,
}).input_ids;

// Create the model
const model = await AutoModelForVision2Seq.from_pretrained(model_id);

// Run inference
const output = await model.generate(image_inputs.pixel_values, {
  decoder_input_ids,
  max_length: model.config.decoder.max_position_embeddings,
});

// Decode output
const decoded = tokenizer.batch_decode(output)[0];
// <s_docvqa><s_question> What is the invoice number?</s_question><s_answer> us-001</s_answer></s>

Kind: static class of models

models.ConvNextModel

The bare ConvNext model outputting raw features without any specific head on top.

Kind: static class of models

models.ConvNextForImageClassification

ConvNext Model with an image classification head on top (a linear layer on top of the pooled features), e.g. for ImageNet.

Kind: static class of models

convNextForImageClassification._call(model_inputs)

Kind: instance method of ConvNextForImageClassification

Param	Type
model_inputs	`any`

models.ConvNextV2Model

The bare ConvNextV2 model outputting raw features without any specific head on top.

Kind: static class of models

models.ConvNextV2ForImageClassification

ConvNextV2 Model with an image classification head on top (a linear layer on top of the pooled features), e.g. for ImageNet.

Kind: static class of models

convNextV2ForImageClassification._call(model_inputs)

Kind: instance method of ConvNextV2ForImageClassification

Param	Type
model_inputs	`any`

models.Dinov2Model

The bare DINOv2 Model transformer outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.Dinov2ForImageClassification

Dinov2 Model transformer with an image classification head on top (a linear layer on top of the final hidden state of the [CLS] token) e.g. for ImageNet.

Kind: static class of models

dinov2ForImageClassification._call(model_inputs)

Kind: instance method of Dinov2ForImageClassification

Param	Type
model_inputs	`any`

models.YolosObjectDetectionOutput

Kind: static class of models

new YolosObjectDetectionOutput(output)

Param	Type	Description
output	`Object`	The output of the model.
output.logits	`Tensor`	Classification logits (including no-object) for all queries.
output.pred_boxes	`Tensor`	Normalized box coordinates for all queries, represented as (center_x, center_y, width, height). These values are normalized to [0, 1], relative to the size of each individual image in the batch (disregarding possible padding).

models.SamModel

Segment Anything Model (SAM) for generating segmentation masks, given an input image and optional 2D point locations and bounding boxes.

Example: Perform mask generation w/ Xenova/sam-vit-base.

import { SamModel, AutoProcessor, RawImage } from '@xenova/transformers';

const model = await SamModel.from_pretrained('Xenova/sam-vit-base');
const processor = await AutoProcessor.from_pretrained('Xenova/sam-vit-base');

const img_url = 'https://huggingface.co/ybelkada/segment-anything/resolve/main/assets/car.png';
const raw_image = await RawImage.read(img_url);
const input_points = [[[450, 600]]]; // 2D localization of a window

const inputs = await processor(raw_image, input_points);
const outputs = await model(inputs);

const masks = await processor.post_process_masks(outputs.pred_masks, inputs.original_sizes, inputs.reshaped_input_sizes);
// [
//   Tensor {
//     dims: [ 1, 3, 1764, 2646 ],
//     type: 'bool',
//     data: Uint8Array(14002632) [ ... ],
//     size: 14002632
//   }
// ]
const scores = outputs.iou_scores;
// Tensor {
//   dims: [ 1, 1, 3 ],
//   type: 'float32',
//   data: Float32Array(3) [
//     0.8892380595207214,
//     0.9311248064041138,
//     0.983696699142456
//   ],
//   size: 3
// }
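
SAM returns several candidate masks per prompt, each with a predicted IoU score, so a typical follow-up is to keep the highest-scoring one. The sketch below works on plain arrays and assumes a batch size of 1 with masks laid out as `[1, num_masks, height, width]`; `argmax`, `selectMask` and the toy mask buffer are hypothetical.

```javascript
// Hypothetical helpers for illustration only.

// Index of the largest value in a list of scores.
function argmax(values) {
  let best = 0;
  for (let i = 1; i < values.length; ++i) {
    if (values[i] > values[best]) best = i;
  }
  return best;
}

// Slice one [height, width] mask out of a flat [1, num_masks, height, width] buffer.
function selectMask(maskData, dims, index) {
  const height = dims[2];
  const width = dims[3];
  const size = height * width;
  return maskData.subarray(index * size, (index + 1) * size);
}

// Scores copied from the example output above
const iouScores = [0.8892380595207214, 0.9311248064041138, 0.983696699142456];
const bestIndex = argmax(iouScores);

// Toy stand-in for a mask tensor with dims [1, 3, 2, 2]
const toyMasks = new Uint8Array([0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0]);
const bestMask = selectMask(toyMasks, [1, 3, 2, 2], bestIndex);
console.log(bestIndex, bestMask);
```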

Kind: static class of models

.SamModel
- new SamModel(config, vision_encoder, prompt_encoder_mask_decoder)
- .get_image_embeddings(model_inputs) ⇒ Promise.<{image_embeddings: Tensor, image_positional_embeddings: Tensor}>
- .forward(model_inputs) ⇒ Promise.<Object>
- ._call(model_inputs) ⇒ Promise.<SamImageSegmentationOutput>

new SamModel(config, vision_encoder, prompt_encoder_mask_decoder)

Creates a new instance of the SamModel class.

Param	Type	Description
config	`Object`	The configuration object specifying the hyperparameters and other model settings.
vision_encoder	`Object`	The ONNX session containing the vision encoder model.
prompt_encoder_mask_decoder	`any`	The ONNX session containing the prompt encoder and mask decoder model.

samModel.get_image_embeddings(model_inputs) ⇒ Promise.<{image_embeddings: Tensor, image_positional_embeddings: Tensor}>

Compute image embeddings and positional image embeddings, given the pixel values of an image.

Kind: instance method of SamModel
Returns: Promise.<{image_embeddings: Tensor, image_positional_embeddings: Tensor}> - The image embeddings and positional image embeddings.

Param	Type	Description
model_inputs	`Object`	Object containing the model inputs.
model_inputs.pixel_values	`Tensor`	Pixel values obtained using a `SamProcessor`.

samModel.forward(model_inputs) ⇒ Promise.<Object>

Kind: instance method of SamModel
Returns: Promise.<Object> - The output of the model.

Param	Type	Description
model_inputs	`SamModelInputs`	Object containing the model inputs.

samModel._call(model_inputs) ⇒ Promise.<SamImageSegmentationOutput>

Runs the model with the provided inputs

Kind: instance method of SamModel
Returns: Promise.<SamImageSegmentationOutput> - Object containing segmentation outputs

Param	Type	Description
model_inputs	`Object`	Model inputs

models.SamImageSegmentationOutput

Base class for Segment-Anything model’s output.

Kind: static class of models

new SamImageSegmentationOutput(output)

Param	Type	Description
output	`Object`	The output of the model.
output.iou_scores	`Tensor`	The predicted IoU (quality) scores of the masks.
output.pred_masks	`Tensor`	Predicted segmentation masks.

models.MarianMTModel

Kind: static class of models

new MarianMTModel(config, session, decoder_merged_session, generation_config)

Creates a new instance of the MarianMTModel class.

Param	Type	Description
config	`Object`	The model configuration object.
session	`Object`	The ONNX session object.
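decoder_merged_session	`any`	The ONNX session containing the merged decoder model.
generation_config	`any`	Configuration object for the generation process.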

models.M2M100ForConditionalGeneration

Kind: static class of models

new M2M100ForConditionalGeneration(config, session, decoder_merged_session, generation_config)

Creates a new instance of the M2M100ForConditionalGeneration class.

Param	Type	Description
config	`Object`	The model configuration object.
session	`Object`	The ONNX session object.
decoder_merged_session	`any`
generation_config	`any`

models.Wav2Vec2Model

The bare Wav2Vec2 Model transformer outputting raw hidden-states without any specific head on top.

Example: Load and run a Wav2Vec2Model for feature extraction.

import { AutoProcessor, AutoModel, read_audio } from '@xenova/transformers';

// Read and preprocess audio
const processor = await AutoProcessor.from_pretrained('Xenova/mms-300m');
const audio = await read_audio('https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac', 16000);
const inputs = await processor(audio);

// Run model with inputs
const model = await AutoModel.from_pretrained('Xenova/mms-300m');
const output = await model(inputs);
// {
//   last_hidden_state: Tensor {
//     dims: [ 1, 1144, 1024 ],
//     type: 'float32',
//     data: Float32Array(1171456) [ ... ],
//     size: 1171456
//   }
// }

Kind: static class of models

models.Wav2Vec2ForAudioFrameClassification

Wav2Vec2 Model with a frame classification head on top for tasks like Speaker Diarization.

Kind: static class of models

wav2Vec2ForAudioFrameClassification._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>

Calls the model on new inputs.

Kind: instance method of Wav2Vec2ForAudioFrameClassification
Returns: Promise.<TokenClassifierOutput> - An object containing the model’s output logits for audio frame classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.UniSpeechModel

The bare UniSpeech Model transformer outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.UniSpeechForCTC

UniSpeech Model with a language modeling head on top for Connectionist Temporal Classification (CTC).

Kind: static class of models

uniSpeechForCTC._call(model_inputs)

Kind: instance method of UniSpeechForCTC

Param	Type	Description
model_inputs	`Object`
model_inputs.input_values	`Tensor`	Float values of input raw speech waveform.
model_inputs.attention_mask	`Tensor`	Mask to avoid performing convolution and attention on padding token indices. Mask values selected in [0, 1]
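
CTC heads emit one score vector per audio frame. A minimal sketch of greedy CTC decoding follows, assuming the logits have already been converted to a nested array of shape (num_frames, vocab_size) and that token id 0 is the blank token. Both points are assumptions; check the tokenizer for the actual blank id. `ctcGreedyDecode` and `argmaxRow` are hypothetical helpers, not library functions.

```javascript
// Index of the largest value in one frame's score vector.
function argmaxRow(row) {
    let best = 0;
    for (let i = 1; i < row.length; ++i) {
        if (row[i] > row[best]) best = i;
    }
    return best;
}

// Greedy CTC decoding: take the argmax per frame, collapse consecutive
// repeats, then drop blank tokens. `blankId = 0` is an assumption.
function ctcGreedyDecode(frameLogits, blankId = 0) {
    const ids = [];
    let previous = -1;
    for (const row of frameLogits) {
        const id = argmaxRow(row);
        if (id !== previous && id !== blankId) ids.push(id);
        previous = id;
    }
    return ids;
}

const toyLogits = [
    [0.1, 2.0, 0.3], // -> 1
    [0.2, 1.5, 0.1], // -> 1 (repeat, collapsed)
    [3.0, 0.1, 0.2], // -> 0 (blank, dropped)
    [0.0, 2.2, 0.4], // -> 1 (kept: a blank separates it from the first 1)
    [0.1, 0.2, 1.9], // -> 2
];
console.log(ctcGreedyDecode(toyLogits)); // [ 1, 1, 2 ]
```

The resulting ids would then be passed to the tokenizer's decode step to obtain text.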

models.UniSpeechForSequenceClassification

UniSpeech Model with a sequence classification head on top (a linear layer over the pooled output).

Kind: static class of models

uniSpeechForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of UniSpeechForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.UniSpeechSatModel

The bare UniSpeechSat Model transformer outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.UniSpeechSatForCTC

UniSpeechSat Model with a language modeling head on top for Connectionist Temporal Classification (CTC).

Kind: static class of models

uniSpeechSatForCTC._call(model_inputs)

Kind: instance method of UniSpeechSatForCTC

Param	Type	Description
model_inputs	`Object`
model_inputs.input_values	`Tensor`	Float values of input raw speech waveform.
model_inputs.attention_mask	`Tensor`	Mask to avoid performing convolution and attention on padding token indices. Mask values selected in [0, 1]

models.UniSpeechSatForSequenceClassification

UniSpeechSat Model with a sequence classification head on top (a linear layer over the pooled output).

Kind: static class of models

uniSpeechSatForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of UniSpeechSatForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.UniSpeechSatForAudioFrameClassification

UniSpeechSat Model with a frame classification head on top for tasks like Speaker Diarization.

Kind: static class of models

uniSpeechSatForAudioFrameClassification._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>

Calls the model on new inputs.

Kind: instance method of UniSpeechSatForAudioFrameClassification
Returns: Promise.<TokenClassifierOutput> - An object containing the model’s output logits for audio frame classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.Wav2Vec2BertModel

The bare Wav2Vec2Bert Model transformer outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.Wav2Vec2BertForCTC

Wav2Vec2Bert Model with a language modeling head on top for Connectionist Temporal Classification (CTC).

Kind: static class of models

wav2Vec2BertForCTC._call(model_inputs)

Kind: instance method of Wav2Vec2BertForCTC

Param	Type	Description
model_inputs	`Object`
model_inputs.input_features	`Tensor`	Float values of input mel-spectrogram.
model_inputs.attention_mask	`Tensor`	Mask to avoid performing convolution and attention on padding token indices. Mask values selected in [0, 1]

models.Wav2Vec2BertForSequenceClassification

Wav2Vec2Bert Model with a sequence classification head on top (a linear layer over the pooled output).

Kind: static class of models

wav2Vec2BertForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of Wav2Vec2BertForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.HubertModel

The bare Hubert Model transformer outputting raw hidden-states without any specific head on top.

Example: Load and run a HubertModel for feature extraction.

import { AutoProcessor, AutoModel, read_audio } from '@xenova/transformers';

// Read and preprocess audio
const processor = await AutoProcessor.from_pretrained('Xenova/hubert-base-ls960');
const audio = await read_audio('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav', 16000);
const inputs = await processor(audio);

// Load and run model with inputs
const model = await AutoModel.from_pretrained('Xenova/hubert-base-ls960');
const output = await model(inputs);
// {
//   last_hidden_state: Tensor {
//     dims: [ 1, 549, 768 ],
//     type: 'float32',
//     data: Float32Array(421632) [0.0682469978928566, 0.08104046434164047, -0.4975186586380005, ...],
//     size: 421632
//   }
// }

Kind: static class of models

models.HubertForCTC

Hubert Model with a language modeling head on top for Connectionist Temporal Classification (CTC).

Kind: static class of models

hubertForCTC._call(model_inputs)

Kind: instance method of HubertForCTC

Param	Type	Description
model_inputs	`Object`
model_inputs.input_values	`Tensor`	Float values of input raw speech waveform.
model_inputs.attention_mask	`Tensor`	Mask to avoid performing convolution and attention on padding token indices. Mask values selected in [0, 1]

models.HubertForSequenceClassification

Hubert Model with a sequence classification head on top (a linear layer over the pooled output) for tasks like SUPERB Keyword Spotting.

Kind: static class of models

hubertForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of HubertForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.WavLMPreTrainedModel

An abstract class to handle weights initialization and a simple interface for downloading and loading pretrained models.

Kind: static class of models

models.WavLMModel

The bare WavLM Model transformer outputting raw hidden-states without any specific head on top.

Example: Load and run a WavLMModel for feature extraction.

import { AutoProcessor, AutoModel, read_audio } from '@xenova/transformers';

// Read and preprocess audio
const processor = await AutoProcessor.from_pretrained('Xenova/wavlm-base');
const audio = await read_audio('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav', 16000);
const inputs = await processor(audio);

// Run model with inputs
const model = await AutoModel.from_pretrained('Xenova/wavlm-base');
const output = await model(inputs);
// {
//   last_hidden_state: Tensor {
//     dims: [ 1, 549, 768 ],
//     type: 'float32',
//     data: Float32Array(421632) [-0.349443256855011, -0.39341306686401367,  0.022836603224277496, ...],
//     size: 421632
//   }
// }

Kind: static class of models

models.WavLMForCTC

WavLM Model with a language modeling head on top for Connectionist Temporal Classification (CTC).

Kind: static class of models

wavLMForCTC._call(model_inputs)

Kind: instance method of WavLMForCTC

Param	Type	Description
model_inputs	`Object`
model_inputs.input_values	`Tensor`	Float values of input raw speech waveform.
model_inputs.attention_mask	`Tensor`	Mask to avoid performing convolution and attention on padding token indices. Mask values selected in [0, 1]

models.WavLMForSequenceClassification

WavLM Model with a sequence classification head on top (a linear layer over the pooled output).

Kind: static class of models

wavLMForSequenceClassification._call(model_inputs) ⇒ Promise.<SequenceClassifierOutput>

Calls the model on new inputs.

Kind: instance method of WavLMForSequenceClassification
Returns: Promise.<SequenceClassifierOutput> - An object containing the model’s output logits for sequence classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.WavLMForXVector

WavLM Model with an XVector feature extraction head on top for tasks like Speaker Verification.

Example: Extract speaker embeddings with WavLMForXVector.

import { AutoProcessor, AutoModel, read_audio } from '@xenova/transformers';

// Read and preprocess audio
const processor = await AutoProcessor.from_pretrained('Xenova/wavlm-base-plus-sv');
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav';
const audio = await read_audio(url, 16000);
const inputs = await processor(audio);

// Run model with inputs
const model = await AutoModel.from_pretrained('Xenova/wavlm-base-plus-sv');
const outputs = await model(inputs);
// {
//   logits: Tensor {
//     dims: [ 1, 512 ],
//     type: 'float32',
//     data: Float32Array(512) [0.5847219228744507, ...],
//     size: 512
//   },
//   embeddings: Tensor {
//     dims: [ 1, 512 ],
//     type: 'float32',
//     data: Float32Array(512) [-0.09079201519489288, ...],
//     size: 512
//   }
// }

Kind: static class of models

wavLMForXVector._call(model_inputs) ⇒ Promise.<XVectorOutput>

Calls the model on new inputs.

Kind: instance method of WavLMForXVector
Returns: Promise.<XVectorOutput> - An object containing the model’s output logits and speaker embeddings.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.
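
For speaker verification, embeddings of two utterances are typically compared with cosine similarity. The sketch below shows that comparison on plain arrays standing in for the `embeddings` data of two model outputs. `cosineSimilarity` is a hypothetical helper, and the threshold is an illustrative value only; a suitable threshold depends on the dataset.

```javascript
// Cosine similarity between two equally sized vectors.
function cosineSimilarity(a, b) {
    if (a.length !== b.length) throw new Error('vectors must have the same length');
    let dot = 0, normA = 0, normB = 0;
    for (let i = 0; i < a.length; ++i) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Illustrative threshold (an assumption, not a library default).
const SAME_SPEAKER_THRESHOLD = 0.7;

// Toy 2-dimensional "embeddings"; real ones would come from `embeddings.data`.
const utteranceA = [1, 0];
const utteranceB = [0, 1];
const similarity = cosineSimilarity(utteranceA, utteranceB);
console.log(similarity);                           // 0
console.log(similarity >= SAME_SPEAKER_THRESHOLD); // false
```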

models.WavLMForAudioFrameClassification

WavLM Model with a frame classification head on top for tasks like Speaker Diarization.

Example: Perform speaker diarization with WavLMForAudioFrameClassification.

import { AutoProcessor, AutoModelForAudioFrameClassification, read_audio } from '@xenova/transformers';

// Read and preprocess audio
const processor = await AutoProcessor.from_pretrained('Xenova/wavlm-base-plus-sd');
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav';
const audio = await read_audio(url, 16000);
const inputs = await processor(audio);

// Run model with inputs
const model = await AutoModelForAudioFrameClassification.from_pretrained('Xenova/wavlm-base-plus-sd');
const { logits } = await model(inputs);
// {
//   logits: Tensor {
//     dims: [ 1, 549, 2 ],  // [batch_size, num_frames, num_speakers]
//     type: 'float32',
//     data: Float32Array(1098) [-3.5301010608673096, ...],
//     size: 1098
//   }
// }

const labels = logits[0].sigmoid().tolist().map(
    frames => frames.map(speaker => speaker > 0.5 ? 1 : 0)
);
console.log(labels); // labels is a one-hot array of shape (num_frames, num_speakers)
// [
//     [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0],
//     [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0],
//     [0, 0], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1],
//     ...
// ]

Kind: static class of models

wavLMForAudioFrameClassification._call(model_inputs) ⇒ Promise.<TokenClassifierOutput>

Calls the model on new inputs.

Kind: instance method of WavLMForAudioFrameClassification
Returns: Promise.<TokenClassifierOutput> - An object containing the model’s output logits for audio frame classification.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.
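
Per-frame labels of shape (num_frames, num_speakers), such as those computed in the diarization example for this class, are often easier to use as contiguous segments. The sketch below groups consecutive active frames per speaker. `framesToSegments` is a hypothetical helper, not a library function, and it reports frame indices rather than seconds.

```javascript
// Groups consecutive active frames into segments for each speaker.
// `labels[frame][speaker]` is 1 when the speaker is active, otherwise 0.
// `endFrame` is exclusive.
function framesToSegments(labels) {
    const segments = [];
    const numSpeakers = labels.length > 0 ? labels[0].length : 0;
    for (let speaker = 0; speaker < numSpeakers; ++speaker) {
        let start = -1; // -1 means the speaker is currently inactive
        // Iterate one step past the end so an open segment gets closed.
        for (let frame = 0; frame <= labels.length; ++frame) {
            const active = frame < labels.length && labels[frame][speaker] === 1;
            if (active && start === -1) {
                start = frame;
            } else if (!active && start !== -1) {
                segments.push({ speaker, startFrame: start, endFrame: frame });
                start = -1;
            }
        }
    }
    return segments;
}

const toyLabels = [[0, 0], [0, 1], [0, 1], [1, 1], [1, 0], [0, 0]];
console.log(framesToSegments(toyLabels));
// [
//   { speaker: 0, startFrame: 3, endFrame: 5 },
//   { speaker: 1, startFrame: 1, endFrame: 4 }
// ]
```

To convert frame indices to seconds, multiply by the frame duration, which can be estimated as the audio duration divided by the number of frames.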

models.SpeechT5PreTrainedModel

An abstract class to handle weights initialization and a simple interface for downloading and loading pretrained models.

Kind: static class of models

models.SpeechT5Model

The bare SpeechT5 Encoder-Decoder Model outputting raw hidden-states without any specific pre- or post-nets.

Kind: static class of models

models.SpeechT5ForSpeechToText

SpeechT5 Model with a speech encoder and a text decoder.

Example: Generate speech from text with SpeechT5ForTextToSpeech.

import { AutoTokenizer, AutoProcessor, SpeechT5ForTextToSpeech, SpeechT5HifiGan, Tensor } from '@xenova/transformers';

// Load the tokenizer and processor
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/speecht5_tts');
const processor = await AutoProcessor.from_pretrained('Xenova/speecht5_tts');

// Load the models
// NOTE: We use the unquantized versions as they are more accurate
const model = await SpeechT5ForTextToSpeech.from_pretrained('Xenova/speecht5_tts', { quantized: false });
const vocoder = await SpeechT5HifiGan.from_pretrained('Xenova/speecht5_hifigan', { quantized: false });

// Load speaker embeddings from URL
const speaker_embeddings_data = new Float32Array(
    await (await fetch('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/speaker_embeddings.bin')).arrayBuffer()
);
const speaker_embeddings = new Tensor(
    'float32',
    speaker_embeddings_data,
    [1, speaker_embeddings_data.length]
)

// Run tokenization
const { input_ids } = tokenizer('Hello, my dog is cute');

// Generate waveform
const { waveform } = await model.generate_speech(input_ids, speaker_embeddings, { vocoder });
console.log(waveform)
// Tensor {
//   dims: [ 26112 ],
//   type: 'float32',
//   size: 26112,
//   data: Float32Array(26112) [ -0.00043630177970044315, -0.00018082228780258447, ... ],
// }

Kind: static class of models

models.SpeechT5ForTextToSpeech

SpeechT5 Model with a text encoder and a speech decoder.

Kind: static class of models

.SpeechT5ForTextToSpeech
- new SpeechT5ForTextToSpeech(config, session, decoder_merged_session, generation_config)
- .generate_speech(input_values, speaker_embeddings, options) ⇒ Promise.<SpeechOutput>

new SpeechT5ForTextToSpeech(config, session, decoder_merged_session, generation_config)

Creates a new instance of the SpeechT5ForTextToSpeech class.

Param	Type	Description
config	`Object`	The model configuration.
session	`any`	session for the model.
decoder_merged_session	`any`	session for the decoder.
generation_config	`GenerationConfig`	The generation configuration.

speechT5ForTextToSpeech.generate_speech(input_values, speaker_embeddings, options) ⇒ Promise.<SpeechOutput>

Converts a sequence of input tokens into a sequence of mel spectrograms, which are subsequently turned into a speech waveform using a vocoder.

Kind: instance method of SpeechT5ForTextToSpeech
Returns: Promise.<SpeechOutput> - A promise which resolves to an object containing the spectrogram, waveform, and cross-attention tensors.

Param	Type	Default	Description
input_values	`Tensor`		Indices of input sequence tokens in the vocabulary.
speaker_embeddings	`Tensor`		Tensor containing the speaker embeddings.
options	`Object`		Optional parameters for generating speech.
[options.threshold]	`number`	`0.5`	The generated sequence ends when the predicted stop token probability exceeds this value.
[options.minlenratio]	`number`	`0.0`	Used to calculate the minimum required length for the output sequence.
[options.maxlenratio]	`number`	`20.0`	Used to calculate the maximum allowed length for the output sequence.
[options.vocoder]	`Object`		The vocoder that converts the mel spectrogram into a speech waveform. If `null`, the output is the mel spectrogram.
[options.output_cross_attentions]	`boolean`	`false`	Whether or not to return the attention tensors of the decoder's cross-attention layers.
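
The generated waveform holds floating-point samples. As a sketch of one way to save them, the following encodes a `Float32Array` as 16-bit PCM mono WAV bytes using only standard JavaScript. `encodeWav` is a hypothetical helper, not part of the library, and the 16000 Hz default is an assumption; use the sampling rate given by your model or vocoder configuration.

```javascript
// Encodes float samples in [-1, 1] as a 16-bit PCM mono WAV file.
// The default sample rate is an assumption, not read from any config.
function encodeWav(samples, sampleRate = 16000) {
    const bytesPerSample = 2;
    const dataSize = samples.length * bytesPerSample;
    const buffer = new ArrayBuffer(44 + dataSize);
    const view = new DataView(buffer);

    const writeString = (offset, text) => {
        for (let i = 0; i < text.length; ++i) {
            view.setUint8(offset + i, text.charCodeAt(i));
        }
    };

    // RIFF header
    writeString(0, 'RIFF');
    view.setUint32(4, 36 + dataSize, true);
    writeString(8, 'WAVE');

    // fmt chunk
    writeString(12, 'fmt ');
    view.setUint32(16, 16, true);                          // fmt chunk size
    view.setUint16(20, 1, true);                           // audio format: PCM
    view.setUint16(22, 1, true);                           // channels: mono
    view.setUint32(24, sampleRate, true);                  // sample rate
    view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
    view.setUint16(32, bytesPerSample, true);              // block align
    view.setUint16(34, 16, true);                          // bits per sample

    // data chunk
    writeString(36, 'data');
    view.setUint32(40, dataSize, true);
    for (let i = 0; i < samples.length; ++i) {
        // Clamp to [-1, 1] before scaling to the 16-bit range.
        const clamped = Math.max(-1, Math.min(1, samples[i]));
        view.setInt16(44 + 2 * i, Math.round(clamped * 32767), true);
    }
    return new Uint8Array(buffer);
}

const wavBytes = encodeWav(new Float32Array([0, 0.5, -1]));
console.log(wavBytes.length); // 50 (44-byte header + 3 samples * 2 bytes)
```

In Node.js, the returned bytes could be written to disk with `fs.writeFileSync('out.wav', wavBytes)`, passing `waveform.data` as the samples.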

models.SpeechT5HifiGan

HiFi-GAN vocoder.

See SpeechT5ForSpeechToText for example usage.

Kind: static class of models

models.TrOCRPreTrainedModel

Kind: static class of models

new TrOCRPreTrainedModel(config, session, generation_config)

Creates a new instance of the TrOCRPreTrainedModel class.

Param	Type	Description
config	`Object`	The configuration of the model.
session	`any`	The ONNX session containing the model weights.
generation_config	`GenerationConfig`	The generation configuration.

models.TrOCRForCausalLM

The TrOCR Decoder with a language modeling head.

Kind: static class of models

models.MistralPreTrainedModel

The bare Mistral Model outputting raw hidden-states without any specific head on top.

Kind: static class of models

new MistralPreTrainedModel(config, session, generation_config)

Creates a new instance of the MistralPreTrainedModel class.

Param	Type	Description
config	`Object`	The configuration of the model.
session	`any`	The ONNX session containing the model weights.
generation_config	`GenerationConfig`	The generation configuration.

models.Starcoder2PreTrainedModel

The bare Starcoder2 Model outputting raw hidden-states without any specific head on top.

Kind: static class of models

new Starcoder2PreTrainedModel(config, session, generation_config)

Creates a new instance of the Starcoder2PreTrainedModel class.

Param	Type	Description
config	`Object`	The configuration of the model.
session	`any`	The ONNX session containing the model weights.
generation_config	`GenerationConfig`	The generation configuration.

models.FalconPreTrainedModel

The bare Falcon Model outputting raw hidden-states without any specific head on top.

Kind: static class of models

new FalconPreTrainedModel(config, session, generation_config)

Creates a new instance of the FalconPreTrainedModel class.

Param	Type	Description
config	`Object`	The configuration of the model.
session	`any`	The ONNX session containing the model weights.
generation_config	`GenerationConfig`	The generation configuration.

models.ClapTextModelWithProjection

CLAP Text Model with a projection layer on top (a linear layer on top of the pooled output).

Example: Compute text embeddings with ClapTextModelWithProjection.

import { AutoTokenizer, ClapTextModelWithProjection } from '@xenova/transformers';

// Load tokenizer and text model
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/clap-htsat-unfused');
const text_model = await ClapTextModelWithProjection.from_pretrained('Xenova/clap-htsat-unfused');

// Run tokenization
const texts = ['a sound of a cat', 'a sound of a dog'];
const text_inputs = tokenizer(texts, { padding: true, truncation: true });

// Compute embeddings
const { text_embeds } = await text_model(text_inputs);
// Tensor {
//   dims: [ 2, 512 ],
//   type: 'float32',
//   data: Float32Array(1024) [ ... ],
//   size: 1024
// }

Kind: static class of models

ClapTextModelWithProjection.from_pretrained() : PreTrainedModel.from_pretrained

Kind: static method of ClapTextModelWithProjection

models.ClapAudioModelWithProjection

CLAP Audio Model with a projection layer on top (a linear layer on top of the pooled output).

Example: Compute audio embeddings with ClapAudioModelWithProjection.

import { AutoProcessor, ClapAudioModelWithProjection, read_audio } from '@xenova/transformers';

// Load processor and audio model
const processor = await AutoProcessor.from_pretrained('Xenova/clap-htsat-unfused');
const audio_model = await ClapAudioModelWithProjection.from_pretrained('Xenova/clap-htsat-unfused');

// Read audio and run processor
const audio = await read_audio('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/cat_meow.wav');
const audio_inputs = await processor(audio);

// Compute embeddings
const { audio_embeds } = await audio_model(audio_inputs);
// Tensor {
//   dims: [ 1, 512 ],
//   type: 'float32',
//   data: Float32Array(512) [ ... ],
//   size: 512
// }

Kind: static class of models

ClapAudioModelWithProjection.from_pretrained() : PreTrainedModel.from_pretrained

Kind: static method of ClapAudioModelWithProjection

models.VitsModel

The complete VITS model, for text-to-speech synthesis.

Example: Generate speech from text with VitsModel.

import { AutoTokenizer, VitsModel } from '@xenova/transformers';

// Load the tokenizer and model
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/mms-tts-eng');
const model = await VitsModel.from_pretrained('Xenova/mms-tts-eng');

// Run tokenization
const inputs = tokenizer('I love transformers');

// Generate waveform
const { waveform } = await model(inputs);
// Tensor {
//   dims: [ 1, 35328 ],
//   type: 'float32',
//   data: Float32Array(35328) [ ... ],
//   size: 35328,
// }

Kind: static class of models

vitsModel._call(model_inputs) ⇒ Promise.<VitsModelOutput>

Calls the model on new inputs.

Kind: instance method of VitsModel
Returns: Promise.<VitsModelOutput> - The outputs for the VITS model.

Param	Type	Description
model_inputs	`Object`	The inputs to the model.

models.SegformerModel

The bare SegFormer encoder (Mix-Transformer) outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.SegformerForImageClassification

SegFormer Model transformer with an image classification head on top (a linear layer on top of the final hidden states) e.g. for ImageNet.

Kind: static class of models

models.SegformerForSemanticSegmentation

SegFormer Model transformer with an all-MLP decode head on top e.g. for ADE20k, CityScapes.

Kind: static class of models

models.StableLmPreTrainedModel

Kind: static class of models

new StableLmPreTrainedModel(config, session, generation_config)

Creates a new instance of the StableLmPreTrainedModel class.

Param	Type	Description
config	`Object`	The configuration of the model.
session	`any`	The ONNX session containing the model weights.
generation_config	`GenerationConfig`	The generation configuration.

models.StableLmModel

The bare StableLm Model transformer outputting raw hidden-states without any specific head on top.

Kind: static class of models

models.StableLmForCausalLM

StableLm Model with a language modeling head on top for Causal Language Modeling (with past).

Kind: static class of models

models.EfficientNetModel

The bare EfficientNet model outputting raw features without any specific head on top.

Kind: static class of models

models.EfficientNetForImageClassification

EfficientNet Model with an image classification head on top (a linear layer on top of the pooled features).

Kind: static class of models

efficientNetForImageClassification._call(model_inputs)

Kind: instance method of EfficientNetForImageClassification

Param	Type
model_inputs	`any`

models.PretrainedMixin

Base class of all AutoModels. Contains the from_pretrained function which is used to instantiate pretrained models.

Kind: static class of models

.PretrainedMixin
- instance
  - .MODEL_CLASS_MAPPINGS : *
  - .BASE_IF_FAIL
- static
  - .from_pretrained() : PreTrainedModel.from_pretrained

pretrainedMixin.MODEL_CLASS_MAPPINGS : *

Mapping from model type to model class.

Kind: instance property of PretrainedMixin

pretrainedMixin.BASE_IF_FAIL

Whether to attempt to instantiate the base class (PreTrainedModel) if the model type is not found in the mapping.

Kind: instance property of PretrainedMixin

PretrainedMixin.from_pretrained() : PreTrainedModel.from_pretrained

Kind: static method of PretrainedMixin
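
To make the roles of `MODEL_CLASS_MAPPINGS` and `BASE_IF_FAIL` more concrete, here is a toy sketch of the lookup they describe: search the mappings for the model type, and fall back to the base class only if the flag allows it. All class names, model types, and the `resolveModelClass` function are hypothetical stand-ins; the actual implementation also loads the config and ONNX sessions and may differ in detail.

```javascript
// Hypothetical stand-ins for a base class and two model classes.
class ToyBaseModel { }
class ToyBertModel extends ToyBaseModel { }
class ToyT5Model extends ToyBaseModel { }

// Looks up `modelType` in each mapping in order. If nothing matches, either
// falls back to the base class or throws, depending on `baseIfFail`.
function resolveModelClass(modelType, mappings, baseIfFail) {
    for (const mapping of mappings) {
        const cls = mapping.get(modelType);
        if (cls) return cls;
    }
    if (baseIfFail) return ToyBaseModel;
    throw new Error(`Unsupported model type: ${modelType}`);
}

const toyMappings = [
    new Map([['bert', ToyBertModel]]),
    new Map([['t5', ToyT5Model]]),
];

console.log(resolveModelClass('t5', toyMappings, false).name);     // ToyT5Model
console.log(resolveModelClass('unknown', toyMappings, true).name); // ToyBaseModel
```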

models.AutoModel

Helper class which is used to instantiate pretrained models with the from_pretrained function. The chosen model class is determined by the type specified in the model config.

Kind: static class of models

autoModel.MODEL_CLASS_MAPPINGS : *

Kind: instance property of AutoModel

models.AutoModelForSequenceClassification

Helper class which is used to instantiate pretrained sequence classification models with the from_pretrained function. The chosen model class is determined by the type specified in the model config.

Kind: static class of models

models.AutoModelForTokenClassification

Helper class which is used to instantiate pretrained token classification models with the from_pretrained function. The chosen model class is determined by the type specified in the model config.

Kind: static class of models

models.AutoModelForSeq2SeqLM

Helper class which is used to instantiate pretrained sequence-to-sequence models with the from_pretrained function. The chosen model class is determined by the type specified in the model config.

Kind: static class of models

models.AutoModelForSpeechSeq2Seq

Helper class which is used to instantiate pretrained sequence-to-sequence speech-to-text models with the from_pretrained function. The chosen model class is determined by the type specified in the model config.

Kind: static class of models

models.AutoModelForTextToSpectrogram

Helper class which is used to instantiate pretrained sequence-to-sequence text-to-spectrogram models with the from_pretrained function. The chosen model class is determined by the type specified in the model config.

Kind: static class of models

models.AutoModelForTextToWaveform

Helper class which is used to instantiate pretrained text-to-waveform models with the from_pretrained function. The chosen model class is determined by the type specified in the model config.

Kind: static class of models

models.AutoModelForCausalLM

Helper class which is used to instantiate pretrained causal language models with the from_pretrained function. The chosen model class is determined by the type specified in the model config.

Kind: static class of models

models.AutoModelForMaskedLM

Helper class which is used to instantiate pretrained masked language models with the from_pretrained function. The chosen model class is determined by the type specified in the model config.

Kind: static class of models

models.AutoModelForQuestionAnswering

Helper class which is used to instantiate pretrained question answering models with the from_pretrained function. The chosen model class is determined by the type specified in the model config.

Kind: static class of models

models.AutoModelForVision2Seq

Helper class which is used to instantiate pretrained vision-to-sequence models with the from_pretrained function. The chosen model class is determined by the type specified in the model config.

Kind: static class of models

models.AutoModelForImageClassification

Helper class which is used to instantiate pretrained image classification models with the from_pretrained function. The chosen model class is determined by the type specified in the model config.

Kind: static class of models

models.AutoModelForImageSegmentation

Helper class which is used to instantiate pretrained image segmentation models with the from_pretrained function. The chosen model class is determined by the type specified in the model config.

Kind: static class of models

models.AutoModelForSemanticSegmentation

Helper class which is used to instantiate pretrained semantic segmentation models with the from_pretrained function. The chosen model class is determined by the type specified in the model config.

Kind: static class of models

models.AutoModelForObjectDetection

Helper class which is used to instantiate pretrained object detection models with the from_pretrained function. The chosen model class is determined by the type specified in the model config.

Kind: static class of models

models.AutoModelForMaskGeneration

Helper class which is used to instantiate pretrained mask generation models with the from_pretrained function. The chosen model class is determined by the type specified in the model config.

Kind: static class of models

models.Seq2SeqLMOutput

Kind: static class of models

new Seq2SeqLMOutput(output)

Param	Type	Description
output	`Object`	The output of the model.
output.logits	`Tensor`	The output logits of the model.
output.past_key_values	`Tensor`	A tensor of key/value pairs that represent the previous state of the model.
output.encoder_outputs	`Tensor`	The output of the encoder in a sequence-to-sequence model.
[output.decoder_attentions]	`Tensor`	Attention weights of the decoder, after the attention softmax, used to compute the weighted average in the self-attention heads.
[output.cross_attentions]	`Tensor`	Attention weights of the decoder's cross-attention layer, after the attention softmax, used to compute the weighted average in the cross-attention heads.

models.SequenceClassifierOutput

Base class for outputs of sentence classification models.

Kind: static class of models

new SequenceClassifierOutput(output)

Param	Type	Description
output	`Object`	The output of the model.
output.logits	`Tensor`	classification (or regression if config.num_labels==1) scores (before SoftMax).
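
Because the logits are scores before SoftMax, turning them into class probabilities takes one extra step. A minimal sketch on a plain array of logits for one input follows; `softmax` here is a local helper written for illustration. For regression models (`config.num_labels == 1`) the raw value is used directly instead.

```javascript
// Numerically stable softmax over one row of logits.
function softmax(logits) {
    const max = Math.max(...logits); // subtract the max to avoid overflow
    const exps = logits.map(x => Math.exp(x - max));
    const sum = exps.reduce((a, b) => a + b, 0);
    return exps.map(x => x / sum);
}

// Equal logits give equal probabilities.
console.log(softmax([0, 0])); // [ 0.5, 0.5 ]

// The predicted class is the index of the largest probability.
const probabilities = softmax([-1.2, 3.4, 0.5]); // made-up logits
const predictedClass = probabilities.indexOf(Math.max(...probabilities));
console.log(predictedClass); // 1
```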

models.XVectorOutput

Base class for outputs of XVector models.

Kind: static class of models

new XVectorOutput(output)

Param	Type	Description
output	`Object`	The output of the model.
output.logits	`Tensor`	Classification hidden states before AMSoftmax, of shape `(batch_size, config.xvector_output_dim)`.
output.embeddings	`Tensor`	Utterance embeddings used for vector similarity-based retrieval, of shape `(batch_size, config.xvector_output_dim)`.

models.TokenClassifierOutput

Base class for outputs of token classification models.

Kind: static class of models

new TokenClassifierOutput(output)

Param	Type	Description
output	`Object`	The output of the model.
output.logits	`Tensor`	Classification scores (before SoftMax).

models.MaskedLMOutput

Base class for masked language model outputs.

Kind: static class of models

new MaskedLMOutput(output)

Param	Type	Description
output	`Object`	The output of the model.
output.logits	`Tensor`	Prediction scores of the language modeling head (scores for each vocabulary token before SoftMax).

models.QuestionAnsweringModelOutput

Base class for outputs of question answering models.

Kind: static class of models

new QuestionAnsweringModelOutput(output)

Param	Type	Description
output	`Object`	The output of the model.
output.start_logits	`Tensor`	Span-start scores (before SoftMax).
output.end_logits	`Tensor`	Span-end scores (before SoftMax).
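
One common way to turn these two score vectors into an answer is to pick the (start, end) pair with the highest combined score, subject to start <= end. The sketch below does this on plain arrays for a single input. `bestSpan` and its `maxAnswerLength` parameter are hypothetical, and this is not necessarily how the library's own pipeline selects spans.

```javascript
// Exhaustively scores every valid span (start <= end, limited length) and
// returns the one with the highest start + end logit sum.
function bestSpan(startLogits, endLogits, maxAnswerLength = 15) {
    let best = { start: 0, end: 0, score: -Infinity };
    for (let start = 0; start < startLogits.length; ++start) {
        const lastEnd = Math.min(endLogits.length, start + maxAnswerLength);
        for (let end = start; end < lastEnd; ++end) {
            const score = startLogits[start] + endLogits[end];
            if (score > best.score) best = { start, end, score };
        }
    }
    return best;
}

// Made-up logits for a 4-token input.
const toyStartLogits = [0, 5, 1, 0];
const toyEndLogits = [0, 1, 4, 3];
console.log(bestSpan(toyStartLogits, toyEndLogits));
// { start: 1, end: 2, score: 9 }
```

The token indices of the chosen span can then be mapped back to text with the tokenizer.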

models.CausalLMOutput

Base class for causal language model (or autoregressive) outputs.

Kind: static class of models

new CausalLMOutput(output)

Param	Type	Description
output	`Object`	The output of the model.
output.logits	`Tensor`	Prediction scores of the language modeling head (scores for each vocabulary token before softmax).
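
For greedy decoding, the next token is the highest-scoring vocabulary entry at the last sequence position. The sketch below reads that position from a flat, row-major data array with dims `[batch_size, sequence_length, vocab_size]`, the same layout as the `data` and `dims` fields shown in the examples. `nextTokenId` is a hypothetical helper and only looks at the first batch item.

```javascript
// Returns the argmax over the vocabulary at the last position of the
// first sequence in the batch.
function nextTokenId(data, dims) {
    const [, seqLength, vocabSize] = dims;
    const offset = (seqLength - 1) * vocabSize; // start of the last position
    let best = 0;
    for (let i = 1; i < vocabSize; ++i) {
        if (data[offset + i] > data[offset + best]) best = i;
    }
    return best;
}

// Toy logits: 1 sequence, 2 positions, vocabulary of 3 tokens.
const toyData = new Float32Array([
    0.1, 0.9, 0.0, // position 0
    0.3, 0.2, 0.8, // position 1 (last)
]);
console.log(nextTokenId(toyData, [1, 2, 3])); // 2
```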

models.CausalLMOutputWithPast

Base class for causal language model (or autoregressive) outputs.

Kind: static class of models

new CausalLMOutputWithPast(output)

Param	Type	Description
output	`Object`	The output of the model.
output.logits	`Tensor`	Prediction scores of the language modeling head (scores for each vocabulary token before softmax).
output.past_key_values	`Tensor`	Contains pre-computed hidden-states (key and values in the self-attention blocks) that can be used (see `past_key_values` input) to speed up sequential decoding.

models.ImageMattingOutput

Kind: static class of models

new ImageMattingOutput(output)

Param	Type	Description
output	`Object`	The output of the model.
output.alphas	`Tensor`	Estimated alpha values, of shape `(batch_size, num_channels, height, width)`.
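
One way to use the estimated alpha values is to write them into the alpha channel of an RGBA pixel buffer. The sketch below assumes a single-channel alpha map for one image with values in the range [0, 1]; both the value range and the `applyAlpha` helper are assumptions for illustration.

```javascript
// Copy an RGBA buffer and replace its alpha channel with the estimated alphas.
function applyAlpha(rgba, alphas) {
    const [, , height, width] = alphas.dims;
    const out = new Uint8ClampedArray(rgba);
    for (let i = 0; i < height * width; ++i) {
        // Assumes alpha values in [0, 1]; scale to the 0-255 byte range.
        out[4 * i + 3] = Math.round(alphas.data[i] * 255);
    }
    return out;
}

// Hypothetical stand-in for `output.alphas`: one image, 1 channel, 1x2 pixels.
const alphas = {
    data: new Float32Array([0.0, 1.0]),
    dims: [1, 1, 1, 2],
};

// Two opaque red pixels in RGBA order.
const pixels = new Uint8ClampedArray([255, 0, 0, 255, 255, 0, 0, 255]);
const matted = applyAlpha(pixels, alphas);
```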

models.VitsModelOutput

Describes the outputs for the VITS model.

Kind: static class of models

new VitsModelOutput(output)

Param	Type	Description
output	`Object`	The output of the model.
output.waveform	`Tensor`	The final audio waveform predicted by the model, of shape `(batch_size, sequence_length)`.
output.spectrogram	`Tensor`	The log-mel spectrogram predicted at the output of the flow model. This spectrogram is passed to the Hi-Fi GAN decoder model to obtain the final audio waveform.
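
To play or save the predicted waveform, the floating-point samples often need converting to 16-bit PCM. The sketch below assumes samples nominally in the range [-1, 1] and clamps anything outside it; the `floatToPcm16` helper and the sample values are hypothetical.

```javascript
// Convert float samples (assumed to lie in [-1, 1]) to signed 16-bit PCM.
function floatToPcm16(samples) {
    const pcm = new Int16Array(samples.length);
    for (let i = 0; i < samples.length; ++i) {
        const clamped = Math.max(-1, Math.min(1, samples[i]));
        pcm[i] = Math.round(clamped * 32767);
    }
    return pcm;
}

// Hypothetical stand-in for `output.waveform`:
// batch_size = 1, sequence_length = 4.
const waveform = {
    data: new Float32Array([0, 0.25, -1, 2]),
    dims: [1, 4],
};

const pcm = floatToPcm16(waveform.data);
```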

models~InferenceSession : *

Kind: inner typedef of models

models~TypedArray : *

Kind: inner typedef of models

models~DecoderOutput ⇒ Promise.<(Array<Array<number>>|EncoderDecoderOutput|DecoderOutput)>

Generates text based on the given inputs and generation configuration using the model.

Kind: inner typedef of models
Returns: Promise.<(Array<Array<number>>|EncoderDecoderOutput|DecoderOutput)> - An array of generated output sequences, where each sequence is an array of token IDs.
Throws:

`Error` Throws an error if the inputs array is empty.

Param	Type	Description
inputs	`Tensor` \| `Array` \| `TypedArray`	An array of input token IDs.
generation_config	`Object` \| `GenerationConfig` \| `null`	The generation configuration to use. If null, default configuration will be used.
logits_processor	`Object` \| `null`	An optional logits processor to use. If null, a new LogitsProcessorList instance will be created.
options	`Object`	Additional options.
[options.inputs_attention_mask]	`Object`	An optional attention mask for the inputs.

models~WhisperGenerationConfig : Object

Kind: inner typedef of models
Extends: GenerationConfig
Properties

Name	Type	Description
[return_timestamps]	`boolean`	Whether to return the timestamps with the text. This enables the `WhisperTimestampsLogitsProcessor`.
[return_token_timestamps]	`boolean`	Whether to return token-level timestamps with the text. This can be used with or without the `return_timestamps` option. To get word-level timestamps, use the tokenizer to group the tokens into words.
[num_frames]	`number`	The number of audio frames available in this chunk. This is only used when generating word-level timestamps.
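
The idea of grouping token-level timestamps into words can be sketched as follows. This is a simplified illustration, not the tokenizer's actual algorithm: it assumes each token is a hypothetical `{ text, start, end }` record and that a token whose text begins with a space starts a new word.

```javascript
// Simplified grouping rule (an assumption for illustration): a token whose text
// starts with a space begins a new word; other tokens extend the current word.
function groupTokensIntoWords(tokens) {
    const words = [];
    for (const token of tokens) {
        const startsWord = token.text.startsWith(' ') || words.length === 0;
        if (startsWord) {
            words.push({ text: token.text.trim(), start: token.start, end: token.end });
        } else {
            const current = words[words.length - 1];
            current.text += token.text;
            current.end = token.end;
        }
    }
    return words;
}

// Hypothetical token-level timestamps, in seconds.
const tokenTimestamps = [
    { text: ' I', start: 0.0, end: 0.2 },
    { text: ' love', start: 0.2, end: 0.6 },
    { text: ' transform', start: 0.6, end: 1.1 },
    { text: 'ers', start: 1.1, end: 1.4 },
];

const words = groupTokensIntoWords(tokenTimestamps);
```

Each resulting word spans from the start of its first token to the end of its last token.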

models~SamModelInputs : Object

Object containing the model inputs.

Kind: inner typedef of models
Properties

Name	Type	Description
pixel_values	`Tensor`	Pixel values as a Tensor with shape `(batch_size, num_channels, height, width)`. These can be obtained using a `SamProcessor`.
input_points	`Tensor`	Input 2D spatial points with shape `(batch_size, num_points, 2)`. This is used by the prompt encoder to encode the prompt.
[input_labels]	`Tensor`	Input labels for the points, as a Tensor of shape `(batch_size, point_batch_size, num_points)`. This is used by the prompt encoder to encode the prompt. There are 4 types of labels: `1` (the point contains the object of interest), `0` (the point does not contain the object of interest), `-1` (the point corresponds to the background), and `-10` (the point is a padding point and should be ignored by the prompt encoder).
[image_embeddings]	`Tensor`	Image embeddings used by the mask decoder.
[image_positional_embeddings]	`Tensor`	Image positional embeddings used by the mask decoder.
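
When prompts contain different numbers of points, the padding label `-10` lets them share one rectangular layout. The sketch below shows only that layout, for a single image, using a plain `{ data, dims }` object; the `padPointLabels` helper is hypothetical, and the element type the library expects for the real tensor is not covered here.

```javascript
const PAD_LABEL = -10;

// Pad each prompt's label list to the longest length, producing a flat buffer
// laid out as (batch_size = 1, point_batch_size, num_points).
function padPointLabels(labelsPerPrompt) {
    const numPoints = Math.max(...labelsPerPrompt.map((labels) => labels.length));
    const data = new Int32Array(labelsPerPrompt.length * numPoints).fill(PAD_LABEL);
    labelsPerPrompt.forEach((labels, p) => data.set(labels, p * numPoints));
    return { data, dims: [1, labelsPerPrompt.length, numPoints] };
}

// Two prompts: one with three labeled points, one with a single point.
const paddedLabels = padPointLabels([[1, 0, 1], [1]]);
```

The matching `input_points` coordinates would need to be padded to the same number of points per prompt.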

models~SpeechOutput : Object

Kind: inner typedef of models
Properties

Name	Type	Description
[spectrogram]	`Tensor`	The predicted log-mel spectrogram of shape `(output_sequence_length, config.num_mel_bins)`. Returned when no `vocoder` is provided.
[waveform]	`Tensor`	The predicted waveform of shape `(num_frames,)`. Returned when a `vocoder` is provided.
[cross_attentions]	`Tensor`	The outputs of the decoder's cross-attention layers of shape `(config.decoder_layers, config.decoder_attention_heads, output_sequence_length, input_sequence_length)`. Returned when `output_cross_attentions` is `true`.
