Is it possible to export unixcoder to ONNX format?

#1
by Lyriccoder - opened

If I follow the guide (https://huggingface.co/docs/transformers/main/serialization) and try to export the unixcoder model, I get the following error:

    raise RepositoryNotFoundError(
transformers.utils.hub.RepositoryNotFoundError: 401 Client Error: Repository not found for url: https://huggingface.co/unixcoder-base/resolve/main/config.json. If the repo is private, make sure you are authenticated.
OSError: unixcoder-base is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'

I tried to export the model using the local files (downloaded from https://huggingface.co/microsoft/unixcoder-base/tree/main),
but I get the following error:

    raise ValueError(
ValueError: Unrecognized processor in unixcoder-base. Should have a `processor_type` key in its preprocessor_config.json, or one of the following `model_type` keys in its config.json: clip, flava, layoutlmv2, layoutlmv3, layoutxlm, sew, sew-d, speech_to_text, speech_to_text_2, trocr, unispeech, unispeech-sat, vilt, vision-text-dual-encoder, wav2vec2, wav2vec2-conformer, wav2vec2_with_lm, wavlm

Could you please tell me whether it is possible to export the UniXcoder model to ONNX?
If not, could you please add support for it?
I am trying to reduce the GPU inference time of the UniXcoder model (I have my own checkpoint).

Microsoft org

Hi, do you have any updates?
@nielsr

Microsoft org

Hi @Lyriccoder ,

You get that error because you need to provide the model name rather than the full URL. The following works for me in Google Colab:

!pip install -q transformers[onnx]
!python -m transformers.onnx --model=microsoft/unixcoder-base onnx/ --atol 1e-4 

First, I am talking about the model fine-tuned for code summarization (the code-summarization folder).
Secondly, I tried to export an already fine-tuned UniXcoder model (Seq2Seq; I have a saved checkpoint) for code summarization, not the original model.
Does the error happen because a Seq2Seq model with a UniXcoder encoder-decoder is not supported?

Microsoft org

Do you mean that your model is an instance of the EncoderDecoderModel class?

I trained your model, located here:

https://github.com/microsoft/CodeBERT/blob/master/UniXcoder/downstream-tasks/code-summarization/model.py. I didn't change that model.

I trained it successfully and have a checkpoint. ONNX supports loading models from checkpoints, not only models officially published on the Hugging Face Hub.
But when I try to export that checkpoint to ONNX (https://github.com/microsoft/CodeBERT/blob/master/UniXcoder/downstream-tasks/code-summarization/model.py), I get the error:

 raise ValueError(
ValueError: Unrecognized processor in unixcoder-base. Should have a `processor_type` key in its preprocessor_config.json, or one of the following `model_type` keys in its config.json: clip, flava, layoutlmv2, layoutlmv3, layoutxlm, sew, sew-d, speech_to_text, speech_to_text_2, trocr, unispeech, unispeech-sat, vilt, vision-text-dual-encoder, wav2vec2, wav2vec2-conformer, wav2vec2_with_lm, wavlm
Microsoft org

Ok, I see. That custom Seq2Seq class isn't supported by the ONNX tools that Hugging Face provides (they only cover models available in the Transformers library).

So this would require a custom implementation. Alternatively (and this is what I'd recommend), you could fine-tune an EncoderDecoderModel, warm-started with the weights of microsoft/unixcoder-base for both the encoder and the decoder, on a code summarization dataset. This is also what the UniXcoder authors did, as seen here.
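Warm-starting an EncoderDecoderModel from the UniXcoder checkpoint might look like the following sketch. The decoder's cross-attention weights are randomly initialized by `from_encoder_decoder_pretrained` and are learned during fine-tuning; the special-token configuration shown is an assumption based on the RoBERTa-style tokenizer UniXcoder uses:

```python
from transformers import AutoTokenizer, EncoderDecoderModel

def build_seq2seq_model():
    """Warm-start an encoder-decoder model from microsoft/unixcoder-base."""
    # Both encoder and decoder start from the same pretrained weights;
    # the decoder additionally gets randomly initialized cross-attention layers.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(
        "microsoft/unixcoder-base", "microsoft/unixcoder-base"
    )
    tokenizer = AutoTokenizer.from_pretrained("microsoft/unixcoder-base")
    # Generation requires these ids on the config (assumed token choices).
    model.config.decoder_start_token_id = tokenizer.cls_token_id
    model.config.pad_token_id = tokenizer.pad_token_id
    model.config.eos_token_id = tokenizer.sep_token_id
    return model, tokenizer
```

The returned model can then be fine-tuned on a code summarization dataset like any other Transformers Seq2Seq model.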

We're planning to add ONNX support for that class soon.

Oh, got it. It's not an issue with UniXcoder then. I am looking forward to support for that class.
When ONNX export supports it, could you please add documentation with an example for that case?

Thank you for your answer.

Lyriccoder changed discussion status to closed
