---
library_name: transformers
license: mit
---

# Model Card for phi-2-chatml

Based on: https://huggingface.co/microsoft/phi-2

Summary of changes made:

1. Add new special tokens for padding (`[PAD]`) and ChatML (`<|im_start|>`, `<|im_end|>`) to support further fine-tuning on instruction/chat datasets (the ChatML prompt layout is sketched after this list)
2. Resize the embedding layer and the final output layer to match the new vocabulary size
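
For context, ChatML wraps every conversation turn between `<|im_start|>` and `<|im_end|>`, with the role name on the header line. Below is a minimal sketch of building such a prompt; the roles and messages are illustrative and not part of this repository.

```python
# Illustrative messages; any role/content pairs work the same way.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]

# Each turn: <|im_start|>{role}\n{content}<|im_end|>\n
prompt = "".join(
    f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
)
prompt += "<|im_start|>assistant\n"  # the model generates after this header

print(prompt)
# <|im_start|>system
# You are a helpful assistant.<|im_end|>
# <|im_start|>user
# What is the capital of France?<|im_end|>
# <|im_start|>assistant
```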

## Code for Reproducibility

```python
import torch
import transformers

# Fix the seed and default device for reproducibility.
transformers.set_seed(42)
torch.set_default_device("cuda")

# Load the base phi-2 model and tokenizer.
model_checkpoint = "microsoft/phi-2"
tokenizer = transformers.AutoTokenizer.from_pretrained(model_checkpoint)
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_checkpoint, torch_dtype=torch.float16, trust_remote_code=True
)

# Register the ChatML special tokens and a dedicated padding token.
num_added_tokens = tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<|im_start|>", "<|im_end|>"], "pad_token": "[PAD]"}
)

# Resize the input embeddings and the final output layer to the new vocabulary size.
model.resize_token_embeddings(len(tokenizer))
```
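
As an optional sanity check (not part of the snippet above), you can verify that each new token now maps to a single ID and then persist the modified model and tokenizer; the output directory name below is arbitrary.

```python
# Each added special token should encode to exactly one token ID.
for tok in ["<|im_start|>", "<|im_end|>", "[PAD]"]:
    ids = tokenizer.encode(tok, add_special_tokens=False)
    print(tok, ids)
    assert len(ids) == 1

# Save the resized model and updated tokenizer (directory name is arbitrary).
model.save_pretrained("phi-2-chatml")
tokenizer.save_pretrained("phi-2-chatml")
```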