distilbert-goodreads-genres_v2

This model is a fine-tuned version of distilbert-base-cased on the UCSD Goodreads Book Graph dataset. At the end of training (epoch 10) it achieves the following results on the evaluation set:

  • Loss: 4.8567

library_name: transformers
tags:
- text-classification
- distilbert
- goodreads
datasets:
- ucsd_goodreads
metrics:
- accuracy
- f1
- loss

Model Details

  • Developed by: Duggirala Vnaga Ananth (G25AIT2032)
  • Institution: IIT Jodhpur | PGD AI Programme
  • Model type: Transformer-based Text Classification
  • Language(s): English
  • Finetuned from model: distilbert-base-cased

MLOps Pipeline Links

Model Description

This model is a fine-tuned version of distilbert-base-cased designed to classify book reviews into seven distinct genres: Poetry, Comics & Graphic, Fantasy & Paranormal, History & Biography, Mystery/Thriller/Crime, Romance, and Young Adult.

This v2 iteration focused on testing model limits via extended training epochs (10) and Bayesian-inspired hyperparameter adjustments to explore the trade-off between training convergence and validation generalization.
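
A minimal inference sketch, assuming the model is published on the Hugging Face Hub under the repository id shown on this page and loads with the standard `text-classification` pipeline. `MODEL_ID`, `GENRES`, and `classify_reviews` are illustrative names, not part of the author's code, and the order of `GENRES` is not claimed to match the model's internal label mapping.

```python
from typing import Dict, List

# Repository id as it appears on this model's page; treat it as an assumption.
MODEL_ID = "nagaananth/distilbert-goodreads-genres_v2"

# The seven genres named in the model description. The order is illustrative
# and is NOT claimed to match the model's id2label mapping.
GENRES = [
    "Poetry",
    "Comics & Graphic",
    "Fantasy & Paranormal",
    "History & Biography",
    "Mystery/Thriller/Crime",
    "Romance",
    "Young Adult",
]


def classify_reviews(texts: List[str], model_id: str = MODEL_ID) -> List[Dict]:
    """Return the top predicted label and score for each review."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # truncation=True keeps long reviews within the model's input limit.
    return classifier(texts, truncation=True)
```

Calling `classify_reviews(["A slow-burn love story set in Paris."])` should return one dictionary per review with `label` and `score` keys. The label strings depend on how the model's configuration was saved and may be generic (for example `LABEL_0`) rather than genre names.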

Intended Uses & Limitations

Intended Use

  • Automated categorization of literary reviews.
  • Baseline for genre-specific sentiment or thematic analysis.

Limitations & Observations (MLOps Critical Analysis)

  • Significant Overfitting: Per the training logs, the training loss fell to 0.1098 by epoch 10 while the validation loss rose to 4.8567. This widening gap indicates that the model has largely memorized the training set.
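  • Model Rewind: To ensure the most usable version was deployed, the load_best_model_at_end flag was used. The final weights are reported as the state at Epoch 2 (Validation Loss: 2.1895). Note that the training log shows a slightly lower validation loss at Epoch 1 (2.1436), so the selection was presumably based on a metric other than validation loss.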
  • Genre Bias: Inference testing reveals a bias toward the "Romance" and "Poetry" labels for ambiguous text, likely due to linguistic overlap between genres in the balanced training set (800 samples per genre).
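
Checkpoint selection and early stopping on validation loss can be sketched from the logged values alone. The snippet below is an illustration, not the author's training code: `best_epoch` and `early_stop_epoch` are hypothetical helpers, and the patience value is an assumption, since the card does not state which metric or patience (if any) was used.

```python
from typing import List, Optional, Tuple

# (epoch, validation loss) pairs copied from the training log in this card.
VAL_LOSS: List[Tuple[int, float]] = [
    (1, 2.1436), (2, 2.1895), (3, 2.2901), (4, 2.9491), (5, 3.4322),
    (6, 4.3519), (7, 4.5286), (8, 4.7578), (9, 4.8699), (10, 4.8567),
]


def best_epoch(log: List[Tuple[int, float]]) -> int:
    """Epoch with the lowest validation loss."""
    return min(log, key=lambda row: row[1])[0]


def early_stop_epoch(log: List[Tuple[int, float]], patience: int) -> Optional[int]:
    """Epoch at which training would halt after `patience` consecutive
    evaluations without a new best validation loss, or None if it never halts."""
    best = float("inf")
    stale = 0
    for epoch, loss in log:
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None
```

On the logged values, `best_epoch(VAL_LOSS)` returns 1 and `early_stop_epoch(VAL_LOSS, patience=2)` returns 3, so a loss-based stopping rule with that patience would have ended the run well before epoch 10.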

Training Procedure

Training Data

  • Dataset: UCSD Goodreads Book Graph.
  • Size: 5,600 training samples (800 per genre, perfectly balanced).
  • Validation: 1,400 samples (200 per genre).
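
One way to draw a balanced split of this shape is sketched below. It is a guess at the procedure, not the contents of the author's data.py: `balanced_split` is a hypothetical helper, the input is assumed to be `(text, genre)` pairs, and reusing seed 42 for sampling is an assumption.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Record = Tuple[str, str]  # (review text, genre)


def balanced_split(
    records: List[Record],
    train_per_genre: int = 800,
    val_per_genre: int = 200,
    seed: int = 42,  # assumption: the card only lists 42 as the training seed
) -> Tuple[List[Record], List[Record]]:
    """Sample disjoint, per-genre-balanced train and validation sets."""
    by_genre: Dict[str, List[str]] = defaultdict(list)
    for text, genre in records:
        by_genre[genre].append(text)

    rng = random.Random(seed)
    needed = train_per_genre + val_per_genre
    train: List[Record] = []
    val: List[Record] = []
    for genre in sorted(by_genre):  # sorted for reproducible ordering
        texts = by_genre[genre]
        if len(texts) < needed:
            raise ValueError(f"{genre}: need {needed} reviews, found {len(texts)}")
        # Sample without replacement, then cut into train and validation parts.
        picked = rng.sample(texts, needed)
        train += [(t, genre) for t in picked[:train_per_genre]]
        val += [(t, genre) for t in picked[train_per_genre:]]
    return train, val
```

With seven genres and the default sizes this yields 7 × 800 = 5,600 training and 7 × 200 = 1,400 validation samples, matching the figures above.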

Hyperparameters

  • Learning Rate: 5e-05
  • Batch Size: 16 (Train/Eval)
  • Optimizer: AdamW (Fused)
  • Epochs: 10
  • Weight Decay: 0.01
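
These settings could be expressed as `TrainingArguments` roughly as follows. This is a sketch rather than the author's train.py: the output directory name and the per-epoch evaluation and save strategies are assumptions (the results table suggests one evaluation per epoch), while the seed and scheduler type come from the auto-generated hyperparameter list in this card.

```python
# Keyword arguments mirroring the hyperparameters reported in this card.
TRAINING_KWARGS = {
    "output_dir": "distilbert-goodreads-genres_v2",  # hypothetical path
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "num_train_epochs": 10,
    "weight_decay": 0.01,
    "optim": "adamw_torch_fused",
    "lr_scheduler_type": "linear",
    "seed": 42,
    # Assumptions: evaluate and save once per epoch. The two strategies must
    # match for load_best_model_at_end to work.
    "eval_strategy": "epoch",
    "save_strategy": "epoch",
    "load_best_model_at_end": True,
}


def build_training_args():
    """Construct TrainingArguments from the kwargs above."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(**TRAINING_KWARGS)
```

Keeping the values in a plain dictionary makes it easy to log the same configuration to an experiment tracker before constructing the `Trainer`.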

Training Results (v2)

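| Epoch | Step | Training Loss | Validation Loss |
|-------|------|---------------|-----------------|
| 1     | 350  | No log        | 2.1436          |
| 2     | 700  | 2.4290        | 2.1895 (Best)   |
| 5     | 1750 | 0.6660        | 3.4322          |
| 10    | 3500 | 0.1098        | 4.8567          |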

Environmental Impact

  • Hardware: NVIDIA T4 Tensor Core GPU
  • Compute Provider: Google Cloud Platform (via Kaggle/Colab)
  • Carbon Emitted: < 0.01 kg CO2eq (Estimated using MLCO2 Impact Tracker)

Technical Specifications

  • Frameworks: Transformers 5.0.0, PyTorch 2.10.0+cu128, Datasets 4.8.3
  • Infrastructure: Modularized Python scripts (data.py, train.py) with integrated wandb logging and huggingface_hub syncing.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 10
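
The step counts in the results table follow from these values and the 5,600-sample training set. The sketch below works through the arithmetic and the shape of a linear schedule, assuming a single device, no gradient accumulation, and no warmup (none of which the card states). `steps_per_epoch` and `linear_lr` are illustrative helpers.

```python
import math

TRAIN_SAMPLES = 5600   # from the Training Data section
BATCH_SIZE = 16
EPOCHS = 10
BASE_LR = 5e-5


def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Optimizer steps per epoch on one device without gradient accumulation."""
    return math.ceil(num_samples / batch_size)


def linear_lr(step: int, total_steps: int, base_lr: float = BASE_LR,
              warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = steps_per_epoch(TRAIN_SAMPLES, BATCH_SIZE) * EPOCHS
```

Here 5,600 / 16 gives 350 steps per epoch and 3,500 steps in total, which matches the Step column. Under these assumptions the learning rate is half its initial value at step 1,750 and reaches zero at step 3,500.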

Training results

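| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| No log        | 1.0   | 350  | 2.1436          |
| 2.4290        | 2.0   | 700  | 2.1895          |
| 1.3851        | 3.0   | 1050 | 2.2901          |
| 1.3851        | 4.0   | 1400 | 2.9491          |
| 0.6660        | 5.0   | 1750 | 3.4322          |
| 0.3433        | 6.0   | 2100 | 4.3519          |
| 0.3433        | 7.0   | 2450 | 4.5286          |
| 0.2059        | 8.0   | 2800 | 4.7578          |
| 0.1519        | 9.0   | 3150 | 4.8699          |
| 0.1098        | 10.0  | 3500 | 4.8567          |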
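
The "No log" entry and the repeated training-loss values (epochs 3 and 4, epochs 6 and 7) are consistent with training loss being logged every 500 steps while evaluation runs every 350 steps. The sketch below checks that reading; the 500-step interval is an assumption (it is the Trainer default for `logging_steps`, but the card does not state the value used), and `last_logged_step` is an illustrative helper.

```python
from typing import Dict, List

LOGGING_STEPS = 500   # assumption: Trainer default, not stated in the card
EVAL_INTERVAL = 350   # one evaluation per epoch, from the Step column


def last_logged_step(eval_step: int, logging_steps: int = LOGGING_STEPS) -> int:
    """Most recent step at or before eval_step where training loss was logged.
    Returns 0 when nothing has been logged yet."""
    return (eval_step // logging_steps) * logging_steps


EVAL_STEPS: List[int] = [EVAL_INTERVAL * epoch for epoch in range(1, 11)]

# Map each evaluation step to the logging step whose loss would be displayed.
ALIGNED: Dict[int, int] = {step: last_logged_step(step) for step in EVAL_STEPS}
```

Under this assumption, steps 1050 and 1400 both map to the loss logged at step 1000, and steps 2100 and 2450 both map to step 2000, which would explain the duplicated values. Step 350 maps to 0, meaning no training loss had been logged yet.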

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.8.3
  • Tokenizers 0.22.2

Citation

@article{sanh2019distilbert,
  title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},
  author={Sanh, Victor and Debut, Lysandre and Chaumond, Adrien and Wolf, Thomas},
  journal={arXiv preprint arXiv:1910.01108},
  year={2019}
}

Model size: 65.8M params (Safetensors, tensor type F32)