Getting errors while training the model with a few fine-tuning modifications

#1
by DivyanshuDaftari - opened

Can someone explain how and what exactly needs to be done to get this up and running? Every time I try training the model, I get this value error: "ValueError: The model did not return a loss from the inputs, only the following keys: last_hidden_state,pooler_output. For reference, the inputs it received are input_ids,token_type_ids,attention_mask."

Hi, what kind of fine-tuning task are you trying? And how are you initializing the model?
It would be good if you could provide a snippet of your code.

import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("law-ai/InLegalBERT")

X_train_set = list(train_set['text'])
y_train_set = list(train_set['label'])
X_test_set = list(test_set['text'])
y_test_set = list(test_set['label'])
X_validation_set = list(validation_set['text'])
y_validation_set = list(validation_set['label'])

train_encoded_input = tokenizer(X_train_set, return_tensors="pt", truncation=True, padding=True)
test_encoded_input = tokenizer(X_test_set, return_tensors="pt", truncation=True, padding=True)

class Dataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels=None):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        if self.labels:
            item["labels"] = torch.tensor(self.labels[idx] - 1)
        return item

    def __len__(self):
        return len(self.encodings["input_ids"])

train_dataset = Dataset(train_encoded_input, y_train_set)
test_dataset = Dataset(test_encoded_input, y_test_set)

model = AutoModel.from_pretrained("law-ai/InLegalBERT")

from transformers import TrainingArguments, Trainer
training_args = TrainingArguments(output_dir="test_trainer", evaluation_strategy="epoch", num_train_epochs=5)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    compute_metrics=compute_metrics,  # assumes a compute_metrics function is defined elsewhere
)
trainer.train()

AutoModel.from_pretrained() gives you the bare BERT model (without any classification head). Consequently, the model returns only the last hidden state of BERT and a pooler output, which is built from the embedding of the [CLS] token.
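You can verify this by calling the bare model yourself; its forward pass returns exactly the keys named in the error, and no loss:

import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("law-ai/InLegalBERT")
model = AutoModel.from_pretrained("law-ai/InLegalBERT")

# The bare encoder has no notion of labels, so it cannot compute a loss.
inputs = tokenizer("The appeal is dismissed.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

print(list(outputs.keys()))             # ['last_hidden_state', 'pooler_output']
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)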

The Trainer, however, requires a model that returns a loss. For that, you need a head on top of the bare BERT model. It might be possible to do this with
AutoModelForSequenceClassification.from_pretrained(), which loads the pretrained encoder and puts a randomly initialized sequence classification head on top of it; you could then plug it into the Trainer directly. However, I need to check this to confirm. You can check it too, meanwhile.
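If it does work, the setup would look something like this sketch (untested; num_labels=2 is an assumption — set it to the number of distinct classes in your label column, and make sure the labels your Dataset emits fall in the range 0..num_labels-1):

from transformers import AutoModelForSequenceClassification, TrainingArguments, Trainer

# Same checkpoint, but with a (randomly initialized) classification head,
# so batches containing "labels" make the model return a loss.
model = AutoModelForSequenceClassification.from_pretrained(
    "law-ai/InLegalBERT",
    num_labels=2,  # assumption: binary classification; adjust to your data
)

training_args = TrainingArguments(
    output_dir="test_trainer",
    evaluation_strategy="epoch",
    num_train_epochs=5,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,  # the Dataset objects built above
    eval_dataset=test_dataset,
    compute_metrics=compute_metrics,
)
trainer.train()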
