Model Card for MLOps Emotion Classifier

Model Details

Model Description

This model is a fine-tuned version of distilbert-base-uncased, trained on a 2,000-sample subset of the dair-ai/emotion dataset. It was developed to fulfill the requirements of an End-to-End MLOps Pipeline academic assignment and demonstrates model fine-tuning, Weights & Biases (W&B) experiment tracking, Docker containerization, and GitHub Actions automation.

Developed by: Group 15
Funded by: Academic Assignment (PGD AI Program, IIT Jodhpur)
Model type: Text Classification
Language(s) (NLP): English
License: MIT License
Finetuned from model: distilbert-base-uncased

Model Sources

Repository: https://github.com/g25ait2122/mlops-pipeline-project

Uses

Direct Use

This model is intended solely for educational purposes to demonstrate a working MLOps pipeline. It classifies English text into six basic emotions (sadness, joy, love, anger, fear, surprise).
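The six classes can be written down as an explicit label mapping. The sketch below assumes the class order follows the order listed above, which is also the ClassLabel order of the dair-ai/emotion dataset. The card does not state whether the published checkpoint emits emotion names or generic LABEL_<n> strings, so the helper `readable_label` (a hypothetical name) handles both.

```python
# Assumed label order: follows the list above and the dair-ai/emotion
# ClassLabel order. Verify against the model's config before relying on it.
ID2LABEL = {0: "sadness", 1: "joy", 2: "love", 3: "anger", 4: "fear", 5: "surprise"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def readable_label(raw_label: str) -> str:
    """Map a generic 'LABEL_<n>' string to an emotion name; pass other names through."""
    if raw_label.startswith("LABEL_"):
        return ID2LABEL[int(raw_label.split("_", 1)[1])]
    return raw_label


print(readable_label("LABEL_1"))
print(readable_label("anger"))
```

If the model's config already defines id2label, the pipeline returns emotion names directly and the helper is a no-op.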

Out-of-Scope Use

Because it was trained on a severely reduced dataset to meet Kaggle's free GPU time limits, this model is not suitable for any real-world or production applications.

Bias, Risks, and Limitations

The model inherits biases from the base DistilBERT model and the subset of the dair-ai/emotion dataset. Accuracy is intentionally compromised for the sake of pipeline speed.

How to Get Started with the Model

Use the code below to run inference:

from transformers import pipeline

classifier = pipeline("text-classification", model="VaibhavG25AIT2122/mlops-emotion-classifier")
print(classifier("I am feeling very happy today!"))

Training Details

Training Data

The model was trained on a 2,000-sample subset of the dair-ai/emotion dataset to keep training times short on free compute.
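The card does not say how the 2,000 samples were chosen. As a minimal sketch, assuming a seeded random sample without replacement, the selection could look like the code below. The function name `sample_subset`, the seed, and the toy rows are illustrative assumptions, not the pipeline's actual code. With the datasets library, a comparable result is typically obtained by calling shuffle(seed=...) and then select(range(n)) on the training split.

```python
import random


def sample_subset(rows, n, seed=42):
    """Return n rows drawn without replacement, reproducible for a fixed seed."""
    rng = random.Random(seed)
    indices = rng.sample(range(len(rows)), n)
    return [rows[i] for i in indices]


# Toy stand-in rows; the real data would come from dair-ai/emotion.
toy_rows = [{"text": f"example {i}", "label": i % 6} for i in range(10_000)]

train_subset = sample_subset(toy_rows, 2000)
print(len(train_subset))
```

Fixing the seed keeps the subset identical across CI runs, which makes metric changes attributable to code changes rather than to resampling.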

Training Procedure

Preprocessing

Null values were dropped and all text was lowercased. The text was then tokenized using DistilBertTokenizer with padding="max_length" and truncation=True.
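The preprocessing steps above can be sketched as follows. The function names `clean_texts` and `tokenize` are hypothetical. The card gives no max_length, so the sketch passes none; in that case padding="max_length" pads to the tokenizer's own maximum length. The `tokenize` function needs the transformers package and access to the tokenizer files, so it is defined but not called here.

```python
from typing import Iterable, List, Optional


def clean_texts(texts: Iterable[Optional[str]]) -> List[str]:
    """Drop null entries and lowercase the remaining strings."""
    return [t.lower() for t in texts if t is not None]


def tokenize(texts: List[str]):
    """Tokenize with the settings named in the card (requires transformers)."""
    from transformers import DistilBertTokenizer  # imported lazily on purpose

    tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
    # No max_length is given in the card, so none is passed here.
    return tokenizer(texts, padding="max_length", truncation=True)


cleaned = clean_texts(["I am SO happy", None, "This is Scary"])
print(cleaned)
```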

Training Hyperparameters

Training regime: fp32
Learning rate: 5e-5
Epochs: 2
Batch size: 16
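For a sense of scale, the hyperparameters above imply a small number of optimizer steps. The calculation below assumes that 16 is the effective (global) batch size and that no gradient accumulation is used; neither is stated in the card. If 16 were a per-device batch size on two GPUs, the step count would be roughly half.

```python
import math

TRAIN_SAMPLES = 2000  # training subset size from the card
BATCH_SIZE = 16       # assumed to be the effective global batch size
EPOCHS = 2

steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

print(steps_per_epoch, total_steps)  # 125 250
```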

Evaluation

Testing Data, Factors & Metrics

The model was evaluated on a 500-sample validation subset using accuracy and weighted F1-score.
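Weighted F1 averages the per-class F1 scores, weighting each class by its number of true examples, so frequent emotions count for more than rare ones. The dependency-free sketch below illustrates both metrics on toy labels; it is not the pipeline's evaluation code, and the function names are illustrative.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n
    return total / len(y_true)


# Toy labels for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

print(round(accuracy(y_true, y_pred), 4))
print(round(weighted_f1(y_true, y_pred), 4))
```

On the toy labels, the per-class F1 scores are 0.5, 0.8, and 2/3 with equal support, so the weighted F1 is their plain mean.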

Results

Accuracy: ~82%
F1 Score: ~0.81

Environmental Impact

Hardware Type: GPU T4 x2
Hours used: < 3 hours
Cloud Provider: Kaggle Notebooks

Technical Specifications

Software

transformers 4.38.2
torch 2.2.1
wandb
datasets

Model Card Contact

Group 15 - Created for MLOps Assignment

Safetensors

Model size: 67M params
Tensor type: F32
