---
library_name: peft
license: mit
base_model: FacebookAI/roberta-base
tags:
  - generated_from_trainer
metrics:
  - accuracy
  - f1 (macro)
  - top3 accuracy
model-index:
  - name: roberta-base-with-tweet-eval-emoji-full
    results: []
---

## Model description

This model is a LoRA (PEFT) adapter fine-tuned from FacebookAI/roberta-base on the tweet_eval (emoji) dataset. It achieves the following results on the evaluation set (a sketch of how these metrics can be computed follows the list):

- Loss: 1.9516
- Accuracy: 0.4252
- F1 (macro): 0.3314
- Top-3 accuracy: 0.6504
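
The card does not include the metric code itself. A minimal sketch of how accuracy, macro F1, and top-3 accuracy could be computed in a `Trainer`-style `compute_metrics` callback, using scikit-learn for the first two (the function below is illustrative, not the exact training script):

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score


def compute_metrics(eval_pred):
    """Accuracy, macro F1, and top-3 accuracy from Trainer predictions."""
    logits, labels = eval_pred.predictions, eval_pred.label_ids
    preds = logits.argmax(axis=-1)
    # A prediction counts as a top-3 hit if the true label is among the
    # three highest-scoring classes.
    top3 = np.argsort(logits, axis=-1)[:, -3:]
    top3_acc = float(np.mean([label in row for label, row in zip(labels, top3)]))
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1_score(labels, preds, average="macro"),
        "top3_accuracy": top3_acc,
    }
```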

## Example of classification

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
from peft import PeftModel

# Adapter checkpoint: the repo ID you pushed to the Hub, or a local path
MODEL_ID = "roberta-base-tweet-emoji-lora"

# Load tokenizer + model
device = 0 if torch.cuda.is_available() else -1
tok = AutoTokenizer.from_pretrained(MODEL_ID)
base_model = AutoModelForSequenceClassification.from_pretrained(
    "FacebookAI/roberta-base",
    num_labels=20,
    ignore_mismatched_sizes=True,
)
model = PeftModel.from_pretrained(base_model, MODEL_ID).eval()

pipe = pipeline(
    task="text-classification",
    model=model,
    tokenizer=tok,
    return_all_scores=True,
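    # note: return_all_scores is deprecated in recent transformers releases;
    # passing top_k=None is the equivalent, non-deprecated way to get all class scores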
    function_to_apply="softmax",
    device=device,
)

# Map label IDs to emojis
id2label = {
    0: "❤",
    1: "😍",
    2: "😂",
    3: "💕",
    4: "🔥",
    5: "😊",
    6: "😎",
    7: "✨",
    8: "💙",
    9: "😘",
    10: "📷",
    11: "🇺🇸",
    12: "☀",
    13: "💜",
    14: "😉",
    15: "💯",
    16: "😁",
    17: "🎄",
    18: "📸",
    19: "😜",
}


def predict_emojis(text, top_k=2):
    """
    Predict top k emojis for the given text.

    Args:
        text (str): Input string.
        top_k (int): Number of top emojis to return.

    Returns:
        str: Space-separated top k emojis.
    """
    probs = pipe(text)[0]
    top = sorted(probs, key=lambda x: x["score"], reverse=True)[:top_k]
    return " ".join(id2label[int(d["label"].split("_")[-1])] for d in top)

print(predict_emojis("Sunny days!"))
```

Output:

😎 ☀
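
To serve the classifier without a runtime dependency on `peft`, the LoRA weights can typically be merged back into the base model. Continuing from the snippet above, a minimal sketch (the output directory name is hypothetical):

```python
# Merge the LoRA adapter into the RoBERTa weights and save a standalone
# sequence-classification model plus its tokenizer.
merged = model.merge_and_unload()
merged.save_pretrained("roberta-base-tweet-emoji-merged")  # hypothetical path
tok.save_pretrained("roberta-base-tweet-emoji-merged")
```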

## Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

- learning_rate: 0.0005
- train_batch_size: 128
- eval_batch_size: 256
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 4
- mixed_precision_training: Native AMP
- label_smoothing_factor: 0.1
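
A minimal sketch of how these settings map onto `TrainingArguments`, assuming per-device batch sizes, fp16 for "Native AMP", and per-epoch evaluation (the `output_dir` is illustrative; this is not the exact training script):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="roberta-base-tweet-emoji-lora",  # hypothetical output path
    learning_rate=5e-4,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=256,
    seed=42,
    optim="adamw_torch",              # AdamW with default betas/epsilon
    lr_scheduler_type="linear",
    warmup_ratio=0.05,
    num_train_epochs=4,
    fp16=True,                        # "Native AMP" mixed precision
    label_smoothing_factor=0.1,
    eval_strategy="epoch",            # assumption: matches the per-epoch results below
)
```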

## Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 (Macro) | Top3 Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:----------:|:-------------:|
| 2.1291        | 1.0   | 352  | 2.4306          | 0.219    | 0.2126     | 0.405         |
| 2.1083        | 2.0   | 704  | 2.3812          | 0.2356   | 0.2411     | 0.43          |
| 2.0068        | 3.0   | 1056 | 2.3611          | 0.2456   | 0.2492     | 0.4442        |
| 1.9326        | 4.0   | 1408 | 2.3694          | 0.2524   | 0.2536     | 0.4472        |

## Framework versions

- PEFT 0.14.0
- Transformers 4.51.3
- PyTorch 2.6.0+cu124
- Datasets 3.6.0
- Tokenizers 0.21.1