base_model_prefix = "transformer"

#265
by Cyrile - opened

Hello, why don't the module names in the Bloom and Bloomz checkpoints follow the naming defined by the BloomPreTrainedModel class: base_model_prefix = "transformer"?
The issue is that TGI, which has its own adaptation of the Bloom modeling code, cannot load models trained and saved with Transformers, because TGI looks for module names without the "transformer" prefix.
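To make the mismatch concrete, here is a minimal sketch of the two naming schemes and a workaround that strips the prefix from checkpoint keys. The helper `strip_prefix` is hypothetical (it is not part of Transformers or TGI), the key names are illustrative, and plain integers stand in for tensors.

```python
def strip_prefix(state_dict, prefix="transformer"):
    """Return a copy of state_dict with a leading '<prefix>.' removed from keys."""
    dotted = prefix + "."
    return {
        (key[len(dotted):] if key.startswith(dotted) else key): value
        for key, value in state_dict.items()
    }


# Illustrative keys as a checkpoint saved with the "transformer" prefix
# might contain them; the values are placeholders, not real weights.
saved_by_transformers = {
    "transformer.word_embeddings.weight": 0,
    "transformer.h.0.input_layernorm.weight": 1,
    "lm_head.weight": 2,
}

# Keys without the prefix, which is the form the post says TGI looks for.
hub_style = strip_prefix(saved_by_transformers)
print(sorted(hub_style))
```

Keys that do not start with the prefix (here `lm_head.weight`) are left untouched, so the helper is safe to run on a checkpoint that is already in the unprefixed form.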

BigScience Workshop org

Well, those foundation models work.

If loading the model and saving it back in transformers changes the names, that's an issue IMO.

We can make something for TGI, but this feels like legacy support, would you agree?

BigScience Workshop org

Hi Narsil,
I agree that the issue seems to be more related to the naming of modules in the foundation models than to TGI itself. What I find strange is that the prefix declared in the code is "transformer.[PyTorch module name]", but in the foundation model this prefix is absent.
If I refer to the BERT model, for example, there is the prefix "bert.[etc]" on the module names, as stipulated in the code: base_model_prefix = "bert".

Indeed, allowing flexibility in TGI to let the user define the prefix would be a more robust solution.
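As a rough illustration of what a user-defined prefix could look like, the sketch below tries the prefixed name first and falls back to the bare name. This is only an assumption about one possible design, not TGI's actual loader; `resolve_name` and its parameters are hypothetical.

```python
def resolve_name(available, name, prefix=""):
    """Find `name` among checkpoint keys, trying '<prefix>.<name>' first."""
    candidates = [f"{prefix}.{name}", name] if prefix else [name]
    for candidate in candidates:
        if candidate in available:
            return candidate
    raise KeyError(f"none of {candidates} found in checkpoint")


# A checkpoint saved with the prefix, and one saved without it.
prefixed_keys = {"transformer.ln_f.weight", "lm_head.weight"}
bare_keys = {"ln_f.weight", "lm_head.weight"}

# The same call resolves against either layout.
print(resolve_name(prefixed_keys, "ln_f.weight", prefix="transformer"))
print(resolve_name(bare_keys, "ln_f.weight", prefix="transformer"))
```

With a lookup like this, both the existing foundation checkpoints and models re-saved with the prefix would load without renaming any weights.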
