yes_no_model_english

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0002

Model description

More information needed

Intended uses & limitations

This model classifies a question–answer pair as affirmative ("Yes"), negative ("No"), or invalid input. The snippet below loads the classifier and runs a single example through it.

import torch
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    GPT2Tokenizer,
    Trainer,
    TrainingArguments,
)

# Model identifier on the Hugging Face Hub
model_id = 'tuskbyte/yes_no_model_english'
label_map = ["Yes", "No", "Invalid input"]  # index order matches the model's class ids
# label_map = {'True': 0, 'False': 1, 'Invalid input': 2}

# Load the model
model = AutoModelForSequenceClassification.from_pretrained(model_id)

try:
    # Try to load the tokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_id)
except OSError:
    # Fallback to a default tokenizer if loading fails
    print(f"Tokenizer for '{model_id}' not found. Using  gpt as fallback.")
    tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

# Optionally wrap the model in a Trainer for batch evaluation; the direct forward pass below does not require it
training_args = TrainingArguments(
    output_dir='./results',  # specify your output directory
    per_device_eval_batch_size=1  # batch size for inference
)

trainer = Trainer(
    model=model,
    args=training_args,
    tokenizer=tokenizer
)

# Example input
question = "Would you like to paticipate ?"
answer = "yes i would"
input_text = f"{question} {answer}"

# Tokenize the input and move model and tensors to the same device
device = "cuda" if torch.cuda.is_available() else "cpu"
inputs = tokenizer(input_text, return_tensors="pt").to(device)
model.to(device)

# Perform inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits

# Get the predicted label
predicted_class_id = logits.argmax().item()
print("predicted_class_id",predicted_class_id)
labels = model.config.id2label
print("labels",labels)
predicted_label = labels[predicted_class_id]

# Output the result
print(f"Predicted label: {predicted_label}")
print(f"Model predection is : {label_map[predicted_class_id]}")
The model supports English input only.
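
For a quick check, the same checkpoint can also be queried through the Transformers text-classification pipeline. This is a minimal sketch, not part of the original card: passing tokenizer="gpt2" mirrors the fallback used above in case the repository does not host its own tokenizer.

from transformers import pipeline

# Minimal sketch: run the same checkpoint through the text-classification pipeline.
# tokenizer="gpt2" is an assumption mirroring the fallback above.
classifier = pipeline(
    "text-classification",
    model="tuskbyte/yes_no_model_english",
    tokenizer="gpt2",
)
print(classifier("Would you like to participate? yes i would"))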

Training procedure

Coming soon.

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 10
  • eval_batch_size: 10
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • num_epochs: 3
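
For reference, these settings map onto TrainingArguments roughly as sketched below; the output directory is a placeholder and the per-device batch-size mapping is an assumption, not a value taken from the original run. The Adam betas and epsilon listed above are the optimizer defaults.

from transformers import TrainingArguments

# Hedged sketch of the arguments implied by the hyperparameter list above.
# output_dir is a placeholder; Adam betas/epsilon are left at their defaults.
training_args = TrainingArguments(
    output_dir="./results",
    learning_rate=5e-5,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=10,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=50,
    num_train_epochs=3,
)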

Training results

Training Loss    Epoch     Step    Validation Loss
1.2072           0.2857    10      1.0470
1.0909           0.5714    20      0.7972
0.8701           0.8571    30      0.5695
0.5525           1.1429    40      0.2802
0.2131           1.4286    50      0.0569
0.0454           1.7143    60      0.0093
0.0144           2.0000    70      0.0012
0.0016           2.2857    80      0.0003
0.0006           2.5714    90      0.0002
0.0006           2.8571    100     0.0002

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.1.2
  • Datasets 2.19.2
  • Tokenizers 0.19.1