Adding Special Tokens for Training

#10
by realtimeriddle - opened

Is there a good way to add special tokens when training? I get an error when I use the resize_token_embeddings() function because this model uses FrozenBNBEmbedding. I have already tried recreating the function with the frozen embeddings in mind, but I'm not sure that can work with this kind of model.

Please note: this code has been deprecated for 4 months now. See README for the updated version.

  1. You can look inside the existing tokenizer - there are already some essentially unused tokens that can be reused as special tokens (see the sketches after this list).
  2. You can resize the embeddings in the original model, then run the quantization notebook to get its 8-bit version (see README). This will take some effort.
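To make the two options concrete, here is a rough sketch of each; the checkpoint names are the standard GPT-J ones and <my_token> is only a placeholder, so adapt as needed.

```python
from transformers import AutoTokenizer

# Option 1 (sketch): reuse GPT-J's spare <|extratoken_N|> slots as special tokens.
# These strings should already be in the tokenizer, so the 8-bit embedding matrix
# keeps its size and nothing needs to be resized.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
sep_id = tokenizer.convert_tokens_to_ids("<|extratoken_1|>")
pad_id = tokenizer.convert_tokens_to_ids("<|extratoken_2|>")
print(sep_id, pad_id)  # existing ids, below the model's 50400-row embedding
```

```python
from transformers import AutoTokenizer, GPTJForCausalLM

# Option 2 (sketch): add genuinely new tokens to the full-precision checkpoint,
# resize its embeddings, save it, and only then run the quantization notebook
# from the README on the resized checkpoint.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
tokenizer.add_special_tokens({"additional_special_tokens": ["<my_token>"]})

model = GPTJForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")  # the fp32/fp16 model, not the 8-bit one
model.resize_token_embeddings(len(tokenizer))

tokenizer.save_pretrained("gpt-j-6B-resized")
model.save_pretrained("gpt-j-6B-resized")  # point the quantization notebook at this folder
```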

However, the newly added tokens will not be trained because their embeddings are frozen. You can also implement a custom forward pass where the new tokens are stored in a separate torch.nn.Embedding layer that is not quantized and is therefore fully trainable. In that case, you will need to slightly modify the existing forward-pass code, which is something you will need to figure out yourself (see the note above).
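Here is a minimal sketch of that last idea, assuming GPT-J's usual layout (a 50400-row input embedding of width 4096 at model.transformer.wte); the ExtendedEmbedding class name and the wiring line are hypothetical, not part of the existing code.

```python
import torch
import torch.nn as nn

class ExtendedEmbedding(nn.Module):
    """Routes ids below orig_vocab_size to the frozen (quantized) embedding and
    ids at or above it to a small trainable nn.Embedding for the new tokens."""

    def __init__(self, frozen_embedding: nn.Module, orig_vocab_size: int,
                 num_new_tokens: int, embed_dim: int):
        super().__init__()
        self.frozen = frozen_embedding  # e.g. the existing FrozenBNBEmbedding, left untouched
        self.orig_vocab_size = orig_vocab_size
        self.new_embed = nn.Embedding(num_new_tokens, embed_dim)  # fully trainable

    def forward(self, input_ids: torch.LongTensor) -> torch.Tensor:
        is_new = input_ids >= self.orig_vocab_size
        # Old ids go through the frozen table; new ids are clamped to a valid row
        # and their outputs are overwritten below.
        out = self.frozen(input_ids.clamp(max=self.orig_vocab_size - 1)).clone()
        if is_new.any():
            out[is_new] = self.new_embed(input_ids[is_new] - self.orig_vocab_size).to(out.dtype)
        return out

# Hypothetical wiring, using GPT-J's attribute names and sizes:
# model.transformer.wte = ExtendedEmbedding(model.transformer.wte, 50400,
#                                           num_new_tokens=2, embed_dim=4096)
```

Note that if the model should also predict the new tokens, the output head (lm_head) needs matching trainable rows as well, which this sketch does not cover.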

Yes, resizing the original model and then running the quantization notebook does seem to work.
Also, the new tokens are indeed not being trained. Thank you for clarifying.

Can we use <|extratoken_1|>, <|extratoken_2|>, etc.?

Maybe? I tried that back in October and I forget how it went. In any case, justheuristic was correct: the model was deprecated at the time, and I think the newer version of the transformers library has a better solution for training 8-bit models now anyway.
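For anyone landing here later, a hedged sketch of what that newer route looks like (this is my reading of it, not something confirmed in this thread): recent transformers releases can load the original checkpoint in 8-bit through bitsandbytes, and the peft library is commonly used on top to train small LoRA adapters instead of the frozen 8-bit weights.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

# Sketch only: load the original checkpoint in 8-bit via bitsandbytes,
# then fine-tune LoRA adapters rather than the frozen 8-bit weights.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B", load_in_8bit=True, device_map="auto"
)
model = prepare_model_for_int8_training(model)
model = get_peft_model(
    model,
    LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
               task_type="CAUSAL_LM"),
)
model.print_trainable_parameters()  # only the LoRA weights are trainable
```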

realtimeriddle changed discussion status to closed
