Why does decoder_start_token_id have to be defined in "google/switch-base-32" or "google/switch-base-64"?

#5
by Karim-Gamal - opened

I was trying the official notebook (https://colab.research.google.com/drive/1aGGVHZmtKmcNBbAwa9hbu58DDpIuB5O4?usp=sharing) for the MoE Switch Transformer, but with "base-32", and got the following error:

ValueError: self.model.config.decoder_start_token_id has to be defined. In SwitchTransformers it is usually set to the pad_token_id.

However, I don't face this problem with "base-8" or "base-16".

Thanks for the report. This is a bug: decoder_start_token_id was left out of this model's config. It should now be fixed: https://huggingface.co/google/switch-base-32/blob/main/config.json#L19
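
For anyone who still hits the error with an older cached copy of the config, a local workaround might look like the sketch below. It follows the hint in the error message (use pad_token_id as the decoder start token). The helper name `ensure_decoder_start_token_id` is hypothetical, and the `SimpleNamespace` object is only a stand-in for a real model config with illustrative values; nothing here is taken from the library itself.

```python
from types import SimpleNamespace


def ensure_decoder_start_token_id(config):
    """Fill in decoder_start_token_id from pad_token_id when it is missing.

    `config` is any object exposing `decoder_start_token_id` and
    `pad_token_id` attributes (hypothetical helper, not a library function).
    """
    if getattr(config, "decoder_start_token_id", None) is None:
        pad_id = getattr(config, "pad_token_id", None)
        if pad_id is None:
            raise ValueError(
                "pad_token_id is not set; cannot infer decoder_start_token_id"
            )
        config.decoder_start_token_id = pad_id
    return config


# Stand-in for a config that lacks the field (values are illustrative).
broken = SimpleNamespace(pad_token_id=0, decoder_start_token_id=None)
fixed = ensure_decoder_start_token_id(broken)
print(fixed.decoder_start_token_id)
```

With a real model, the same helper would presumably be called on `model.config` right after loading and before training or generation. Re-downloading the updated config is the cleaner fix.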

Thanks for your quick response. 😀