Trouble using it

by charlycop - opened

Hi, can you help me with an example of how to use this in my web app? I was able to replicate your example with NLLB, but I would like to use the small OPUS models.

Thanks.

Hi there - Here's some example code:

import { pipeline } from '@xenova/transformers';

const generator = await pipeline('translation', 'Xenova/opus-mt-en-es');
const output = await generator('How are you?', {
    top_k: 0,
    do_sample: false,
    num_beams: 1,
});
console.log(output);
// [{ translation_text: '¿Cómo estás?' }]

Hope that helps!

Yes, that helps a lot, thanks. I'll try later.

But my questions:

  1. How do you know you should use the translation keyword in the pipeline argument for this specific model?
  2. How do I know, or how can I specify, which .onnx file will be downloaded? I checked the JSON files, and I can't see where this information is given.

Thanks a lot for your help. I know my questions may be naïve, but I come from the embedded world, and JS is so high level that I have some trouble wrapping my brain around it.

Charly.

It's working, thanks A LOT!! If you can answer my previous questions, it would be very welcome :)

How do you know you should use the translation keyword in the pipeline argument for this specific model?

That's just a convention. Have a look at the documentation:
https://huggingface.co/docs/transformers.js/index

Transformers.js tries to follow conventions from the Python version of Transformers, so that documentation can also be surprisingly useful.
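
If you want to check programmatically, the model's config also hints at the task. A minimal sketch (reading the architecture off the config is my own heuristic, not an official task lookup):

import { AutoConfig } from '@xenova/transformers';

// Inspect the model's configuration to see which architecture it uses.
const config = await AutoConfig.from_pretrained('Xenova/opus-mt-en-es');
console.log(config.model_type); // 'marian', a seq2seq architecture, hence the 'translation' task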

How do I know, or how can I specify, which .onnx file will be downloaded? I checked the JSON files, and I can't see where this information is given.

It's chosen automatically, depending on your settings. For example:

// From a web-worker handler: dtype selects the quantization level (and thus
// which .onnx file is downloaded) per sub-model; device selects the backend.
return MusicgenForConditionalGeneration.from_pretrained('Xenova/musicgen-small', {
    progress_callback: (progress_data) => {
        console.log("MUSICGEN WORKER: model download progress_callback: progress_data: ", progress_data);
        if (progress_data.status !== 'progress') return;
        self.postMessage(progress_data);
    },
    dtype: {
        text_encoder: 'q8',
        decoder_model_merged: 'q8',
        encodec_decode: 'fp32',
    },
    device: 'wasm',
});

You could do it manually if you really wanted to...

import * as ort from 'onnxruntime-web';
import { AutoTokenizer } from '@xenova/transformers';

// Choose the .onnx file yourself based on the desired precision
// (bits comes from your own configuration).
let onnx_file_name = 'model_fp16.onnx';
if (bits === 32) {
    onnx_file_name = 'model.onnx';
}
if (bits === 8) {
    onnx_file_name = 'model_int8.onnx';
}

const session = await ort.InferenceSession.create('models/Xenova/all-MiniLM-L6-v2/onnx/' + onnx_file_name, { executionProviders: ['webgpu'], log_severity_level: 0 });
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/all-MiniLM-L6-v2');
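
If you stay with the high-level API in @xenova/transformers, the simplest knob is the quantized option. A minimal sketch (assuming your model repo ships both a quantized and a full-precision file, as the Xenova conversions usually do):

import { pipeline } from '@xenova/transformers';

// quantized: true (the default) fetches onnx/model_quantized.onnx;
// quantized: false fetches the full-precision onnx/model.onnx instead.
const generator = await pipeline('translation', 'Xenova/opus-mt-en-es', {
    quantized: false,
});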

It's chosen automatically, depending on your settings. For example:

Thanks a lot for taking the time, but I don't get it.

Automatically? What do you mean? Nothing is magic in computing, so automatically, by whom? The documentation says it's quantized by default, but I have 3 quantized .onnx files in the folder, so which one?

What do you mean by my settings? Is it this code below? (But in my code I don't have this type of settings.)

dtype: {
    text_encoder: 'q8',
    decoder_model_merged: 'q8',
    encodec_decode: 'fp32',
},

So to help me understand, can you give me an example with this model opus-mt-en-es and the code below? And tell me which .onnx file will be picked?

import { pipeline } from '@xenova/transformers';

const generator = await pipeline('translation', 'Xenova/opus-mt-en-es');
const output = await generator('How are you?', {
    top_k: 0,
    do_sample: false,
    num_beams: 1,
});
console.log(output);
// [{ translation_text: '¿Cómo estás?' }]

Thanks!

tell me which .onnx file will be picked?

I'm not sure, but you can check in your browser's developer console (Network tab) which files get downloaded.
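
You can also log it from code via progress_callback. A small sketch; I believe the progress events carry a file field, but verify against your version:

import { pipeline } from '@xenova/transformers';

// Print each file as it is fetched, to see exactly which .onnx gets picked.
const generator = await pipeline('translation', 'Xenova/opus-mt-en-es', {
    progress_callback: (data) => console.log(data.status, data.file),
});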

so automatically, by whom?

By Transformers.js... the library resolves the file name from the options you pass (or its defaults).

but in my code I don't have this type of settings

Mine normally doesn't either. I'm just saying you can 'choose' a quantization level by playing around with dtype.

Hope that helps!

It helped a lot, and it's working now. However, can you give us example code for Angular? We have some trouble porting it to our Angular app.

Rgds.
