base_model_prefix = "transformer"

#265

by Cyrile - opened Oct 2, 2023

Oct 2, 2023

Hello, why don't the module names in the Bloom and Bloomz checkpoints follow the naming implied by the BloomPreTrainedModel class, which sets base_model_prefix = "transformer"?
The issue is that TGI, which has its own adaptation of the Bloom modeling code, cannot load Bloom models trained with Transformers, because TGI looks for module names without the "transformer" prefix.
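
To make the mismatch concrete, here is a minimal sketch of renaming checkpoint keys so that they match the un-prefixed layout. The helper `strip_prefix` is hypothetical (it is not part of Transformers or TGI), the key names are illustrative, and plain strings stand in for the tensors:

```python
# Hypothetical helper, not part of Transformers or TGI.
def strip_prefix(state_dict, prefix="transformer."):
    """Return a copy of state_dict with `prefix` removed from keys that start with it."""
    renamed = {}
    for name, tensor in state_dict.items():
        # Keys that do not carry the prefix are kept unchanged.
        new_name = name[len(prefix):] if name.startswith(prefix) else name
        if new_name in renamed:
            raise ValueError(f"key collision after stripping prefix: {new_name}")
        renamed[new_name] = tensor
    return renamed


# Illustrative key names; the string values stand in for tensors.
fine_tuned = {
    "transformer.word_embeddings.weight": "W_emb",
    "transformer.h.0.input_layernorm.weight": "W_ln",
    "lm_head.weight": "W_head",
}

normalized = strip_prefix(fine_tuned)
print(sorted(normalized))
```

With a real checkpoint, the same renaming would be applied to the loaded state dict before saving it back, assuming the loader expects the un-prefixed names.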

lysandre

Oct 2, 2023

WDYT @Narsil ?

NolwennO

Oct 2, 2023

FYI, here is the related issue description: https://github.com/huggingface/text-generation-inference/issues/541#issuecomment-1740913948

Narsil

BigScience Workshop org Oct 3, 2023

Well, those foundation models work.

If loading the model and saving it back in transformers changes it, that's an issue IMO.

We can make something for TGI, but this feels like legacy support. Would you agree?

Narsil

BigScience Workshop org Oct 3, 2023

This should fix it: https://github.com/huggingface/text-generation-inference/pull/1090

Cyrile

Oct 4, 2023 • edited Oct 4, 2023

Hi Narsil,
I agree that the issue seems to be related more to the naming of modules in the foundation models than to TGI. What I find strange is that the prefix planned in the code is "transformer.[PyTorch module name]", but in the foundation model this prefix is absent.
In the BERT model, for example, the module names carry the prefix "bert.[etc]", as stipulated in the code: base_model_prefix = "bert".

Cyrile

Oct 4, 2023

Indeed, allowing flexibility in TGI to let the user define the prefix would be a more robust solution.
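
As a rough sketch of what a user-defined prefix could look like, the lookup below tries each candidate prefix in turn. The function `resolve_name` and its `prefixes` parameter are hypothetical and do not reflect TGI's actual loading code:

```python
# Hypothetical sketch; this is not TGI's actual weight-loading API.
def resolve_name(available, name, prefixes=("", "transformer.")):
    """Return the first `prefix + name` that exists among the checkpoint keys."""
    for prefix in prefixes:
        candidate = prefix + name
        if candidate in available:
            return candidate
    raise KeyError(f"{name!r} not found with any of the prefixes {prefixes!r}")


# Illustrative key sets for the two checkpoint layouts discussed in this thread.
foundation_keys = {"word_embeddings.weight", "ln_f.weight"}
fine_tuned_keys = {"transformer.word_embeddings.weight", "transformer.ln_f.weight"}

print(resolve_name(foundation_keys, "ln_f.weight"))
print(resolve_name(fine_tuned_keys, "ln_f.weight"))
```

Passing the prefix list in as a parameter lets the same loader handle both layouts, assuming the user supplies the prefix their checkpoint actually uses.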
