The attention mask and the pad token id were not set.

#64 opened by victor314159

I'm brand new to AI, so I'm not familiar with all the concepts yet. Still, my minimal chat program is very simple, and I'm already getting a worrying warning, even though the model seems to work.

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

device = "cuda"  # The device to load the model onto

modelName = "../Mistral-7B-Instruct-v0.1"

model = AutoModelForCausalLM.from_pretrained(modelName, device_map="auto", torch_dtype=torch.float16)  # load in fp16 to fit on an RTX 4090
tokenizer = AutoTokenizer.from_pretrained(modelName)

# Initialize an empty conversation history
conversation_history = []

# Define a function to generate responses
def generate_response(input_text, model, tokenizer, device, conversation_history):

    messages = conversation_history + [{"role": "user", "content": input_text}]

    encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")
    model_inputs = encodeds.to(device)
    generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
    decoded = tokenizer.batch_decode(generated_ids)
    return decoded[0]

while True:
    user_input = input("You: ")
    if user_input.lower() == "exit":
        print("Chatbot: Goodbye!")
        break
    response = generate_response(user_input, model, tokenizer, device, conversation_history)
    print("Chatbot:", response[response.rfind("[/INST]") + len("[/INST]"):response.rfind("</s>") ])

    # Update the conversation history with the user's input and the bot's response
    conversation_history.append({"role": "user", "content": user_input})
    conversation_history.append({"role": "assistant", "content": response})

I am getting the following warning:
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results. Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.

Am I missing something?

You need to use either the AutoConfig or MistralConfig class to set the configuration details.
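
For example, a minimal sketch of what that approach might look like (my reading of the suggestion, untested): load the config separately with AutoConfig, set pad_token_id on it, and pass it back into from_pretrained.

from transformers import AutoConfig, AutoModelForCausalLM
import torch

modelName = "../Mistral-7B-Instruct-v0.1"

# Load the config on its own and set the pad token explicitly.
# Reusing eos as pad mirrors what the warning falls back to anyway.
config = AutoConfig.from_pretrained(modelName)
config.pad_token_id = config.eos_token_id

# Passing the patched config back into from_pretrained keeps kwargs
# like device_map="auto" working as usual.
model = AutoModelForCausalLM.from_pretrained(
    modelName,
    config=config,
    device_map="auto",
    torch_dtype=torch.float16,
)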

I tried loading the config and creating the model from it, but that path does not cooperate with the device_map="auto" feature, even when I put it in the config.json file, so I cannot load the model across two GPUs automatically.
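
For what it's worth, the warning can also be silenced without touching the config at all, by tokenizing the templated prompt yourself (so the tokenizer returns an attention_mask) and passing pad_token_id straight to generate, as the warning text itself suggests. A sketch against the original generate_response, untested:

def generate_response(input_text, model, tokenizer, device, conversation_history):
    messages = conversation_history + [{"role": "user", "content": input_text}]

    # Render the chat template to a string, then tokenize it ourselves so
    # the tokenizer also returns an attention_mask (apply_chat_template on
    # its own only returns the input ids).
    prompt = tokenizer.apply_chat_template(messages, tokenize=False)
    # add_special_tokens=False: the Mistral chat template already prepends
    # <s> itself (an assumption based on the v0.1 template).
    inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(device)

    generated_ids = model.generate(
        inputs.input_ids,
        attention_mask=inputs.attention_mask,
        pad_token_id=tokenizer.eos_token_id,  # silences the pad-token warning
        max_new_tokens=1000,
        do_sample=True,
    )
    return tokenizer.batch_decode(generated_ids)[0]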
