yes_no_model_english

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0002

Model description

More information needed

Intended uses & limitations

This model classifies a question–answer pair as affirmative ("Yes"), negative ("No"), or invalid input. The snippet below loads the classifier and runs a single example through it.

import torch
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    GPT2Tokenizer,
    Trainer,
    TrainingArguments,
)

# Model identifier on the Hugging Face Hub
model_id = 'tuskbyte/yes_no_model_english'
label_map = ["Yes", "No", "Invalid input"]  # index order matches the model's class ids
# label_map = {'True': 0, 'False': 1, 'Invalid input': 2}

# Load the model
model = AutoModelForSequenceClassification.from_pretrained(model_id)

try:
    # Try to load the tokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_id)
except OSError:
    # Fallback to a default tokenizer if loading fails
    print(f"Tokenizer for '{model_id}' not found. Using  gpt as fallback.")
    tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

# Optionally wrap the model in a Trainer for batch evaluation; the direct forward pass below does not require it
training_args = TrainingArguments(
    output_dir='./results',  # specify your output directory
    per_device_eval_batch_size=1  # batch size for inference
)

trainer = Trainer(
    model=model,
    args=training_args,
    tokenizer=tokenizer
)

# Example input
question = "Would you like to paticipate ?"
answer = "yes i would"
input_text = f"{question} {answer}"

# Tokenize the input and move model and tensors to the same device
device = "cuda" if torch.cuda.is_available() else "cpu"
inputs = tokenizer(input_text, return_tensors="pt").to(device)
model.to(device)

# Perform inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits

# Get the predicted label
predicted_class_id = logits.argmax().item()
print("predicted_class_id",predicted_class_id)
labels = model.config.id2label
print("labels",labels)
predicted_label = labels[predicted_class_id]

# Output the result
print(f"Predicted label: {predicted_label}")
print(f"Model predection is : {label_map[predicted_class_id]}")
The model supports English input only.
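
For a quick check, the same checkpoint can also be queried through the Transformers text-classification pipeline. This is a minimal sketch, not part of the original card: passing tokenizer="gpt2" mirrors the fallback used above in case the repository does not host its own tokenizer.

from transformers import pipeline

# Minimal sketch: run the same checkpoint through the text-classification pipeline.
# tokenizer="gpt2" is an assumption mirroring the fallback above.
classifier = pipeline(
    "text-classification",
    model="tuskbyte/yes_no_model_english",
    tokenizer="gpt2",
)
print(classifier("Would you like to participate? yes i would"))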

Training procedure

Coming soon.

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 10
  • eval_batch_size: 10
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • num_epochs: 3
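
For reference, these settings map onto TrainingArguments roughly as sketched below; the output directory is a placeholder and the per-device batch-size mapping is an assumption, not a value taken from the original run. The Adam betas and epsilon listed above are the optimizer defaults.

from transformers import TrainingArguments

# Hedged sketch of the arguments implied by the hyperparameter list above.
# output_dir is a placeholder; Adam betas/epsilon are left at their defaults.
training_args = TrainingArguments(
    output_dir="./results",
    learning_rate=5e-5,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=10,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=50,
    num_train_epochs=3,
)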

Training results

Training Loss    Epoch     Step    Validation Loss
1.2072           0.2857    10      1.0470
1.0909           0.5714    20      0.7972
0.8701           0.8571    30      0.5695
0.5525           1.1429    40      0.2802
0.2131           1.4286    50      0.0569
0.0454           1.7143    60      0.0093
0.0144           2.0000    70      0.0012
0.0016           2.2857    80      0.0003
0.0006           2.5714    90      0.0002
0.0006           2.8571    100     0.0002

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.1.2
  • Datasets 2.19.2
  • Tokenizers 0.19.1