distilbert-goodreads-genres_v2

This model is a fine-tuned version of distilbert-base-cased on the UCSD Goodreads Book Graph dataset. It achieves the following results on the evaluation set after the final (10th) epoch:

Loss: 4.8567

library_name: transformers
tags:
- text-classification
- distilbert
- goodreads
datasets:
- ucsd_goodreads
metrics:
- accuracy
- f1
- loss

Model Details

Developed by: Duggirala Vnaga Ananth (G25AIT2032)
Institution: IIT Jodhpur | PGD AI Programme
Model type: Transformer-based Text Classification
Language(s): English
Finetuned from model: distilbert-base-cased

MLOps Pipeline Links

GitHub Repository: g25ait2032-prog/nagaananth
Experiment Tracking (W&B): View Final Run & Artifacts
Hugging Face Model Hub: nagaananth/distilbert-goodreads-genres_v2

Model Description

This model is a fine-tuned version of distilbert-base-cased designed to classify book reviews into seven distinct genres: Poetry, Comics & Graphic, Fantasy & Paranormal, History & Biography, Mystery/Thriller/Crime, Romance, and Young Adult.
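
A minimal inference sketch, assuming the checkpoint can be loaded from the Hub under the id listed on this card and that the `transformers` `pipeline` API is installed. The helper names `top_label` and `classify_review` are illustrative, and the label strings returned depend on the `id2label` mapping stored in the model config, which this card does not list.

```python
from typing import List

MODEL_ID = "nagaananth/distilbert-goodreads-genres_v2"  # Hub id from this card


def top_label(scores: List[dict]) -> str:
    """Return the label with the highest score from a list of {label, score} dicts."""
    return max(scores, key=lambda item: item["score"])["label"]


def classify_review(text: str) -> str:
    """Classify one review; downloads the model on first call (needs network access)."""
    from transformers import pipeline  # lazy import so top_label works without the library

    classifier = pipeline("text-classification", model=MODEL_ID, top_k=None)
    # With top_k=None the pipeline returns a score for every label.
    result = classifier(text, truncation=True)
    # Depending on the library version, a single input may come back nested one level deeper.
    scores = result[0] if isinstance(result[0], list) else result
    return top_label(scores)
```

Calling `classify_review("A slow-burn love story set in Paris.")` would return whichever label string the model config assigns to the highest-scoring class.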

This v2 iteration focused on testing model limits via extended training epochs (10) and Bayesian-inspired hyperparameter adjustments to explore the trade-off between training convergence and validation generalization.

Intended Uses & Limitations

Intended Use

Automated categorization of literary reviews.
Baseline for genre-specific sentiment or thematic analysis.

Limitations & Observations (MLOps Critical Analysis)

Significant Overfitting: Per the training logs, training loss fell to 0.1098 by epoch 10 while validation loss rose from 2.1436 to 4.8567. This widening gap indicates the model has memorized the training set.
Model Rewind: To deploy the most usable version, the load_best_model_at_end flag was used. The final weights are reported as the state at Epoch 2 (Validation Loss: 2.1895). Note that the logged validation loss is slightly lower at Epoch 1 (2.1436), so Epoch 2 is the best checkpoint only under a selection metric other than validation loss.
Genre Bias: Inference testing reveals a bias toward the "Romance" and "Poetry" labels for ambiguous text, likely due to linguistic overlap between genres in the balanced training set (800 samples per genre).

Training Procedure

Training Data

Dataset: UCSD Goodreads Book Graph.
Size: 5,600 training samples (800 per genre, perfectly balanced).
Validation: 1,400 samples (200 per genre).
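
The balanced split described above could be built roughly as follows. This is a sketch, not the author's `data.py`: the function name `balanced_split`, the `(text, genre)` record shape, and the default seed are assumptions made for illustration.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Record = Tuple[str, str]  # (review_text, genre), a hypothetical record shape


def balanced_split(
    records: List[Record], n_train: int = 800, n_val: int = 200, seed: int = 42
) -> Tuple[List[Record], List[Record]]:
    """Draw n_train + n_val reviews per genre, with no overlap between the splits."""
    by_genre: Dict[str, List[Record]] = defaultdict(list)
    for record in records:
        by_genre[record[1]].append(record)

    rng = random.Random(seed)
    train: List[Record] = []
    val: List[Record] = []
    for genre in sorted(by_genre):  # sorted so the result is reproducible
        pool = by_genre[genre]
        if len(pool) < n_train + n_val:
            raise ValueError(f"not enough reviews for genre {genre!r}")
        picked = rng.sample(pool, n_train + n_val)
        train.extend(picked[:n_train])
        val.extend(picked[n_train:])
    return train, val
```

With seven genres and the default arguments, this yields 5,600 training and 1,400 validation samples, matching the sizes listed above.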

Hyperparameters

Learning Rate: 5e-05
Batch Size: 16 (Train/Eval)
Optimizer: AdamW (Fused)
Epochs: 10
Weight Decay: 0.01
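
For orientation, the listed values map onto `transformers.TrainingArguments` roughly as shown below. This is a sketch rather than the author's `train.py`: `output_dir`, the per-epoch evaluation and save strategy, and selecting the best checkpoint by `eval_loss` are assumptions. The card itself only states that `load_best_model_at_end` was used.

```python
# Keyword arguments for transformers.TrainingArguments, kept as a plain dict
# so the values can be inspected without importing the library.
TRAINING_KWARGS = {
    "output_dir": "distilbert-goodreads-genres_v2",  # assumed path
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "num_train_epochs": 10,
    "weight_decay": 0.01,
    "optim": "adamw_torch_fused",
    "eval_strategy": "epoch",              # assumed: evaluate once per epoch
    "save_strategy": "epoch",              # must match eval_strategy for best-model loading
    "load_best_model_at_end": True,
    "metric_for_best_model": "eval_loss",  # assumed selection metric
    "greater_is_better": False,            # lower loss is better
}


def build_training_args():
    """Create the TrainingArguments object (requires transformers and PyTorch)."""
    from transformers import TrainingArguments

    return TrainingArguments(**TRAINING_KWARGS)
```

Older Transformers releases spell the evaluation setting `evaluation_strategy`; the `eval_strategy` name used here matches recent versions such as the one listed under Technical Specifications.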

Training Results (v2)

Epoch	Step	Training Loss	Validation Loss
1	350	No log	2.1436
2	700	2.4290	2.1895 (Best)
5	1750	0.6660	3.4322
10	3500	0.1098	4.8567
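
The step counts in the table follow from the dataset and batch sizes. A quick check, assuming a single device and no gradient accumulation (neither is stated in this card):

```python
import math

train_samples = 5600  # 800 reviews x 7 genres
batch_size = 16
epochs = 10

steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs

print(steps_per_epoch)  # 350
print(total_steps)      # 3500
```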

Environmental Impact

Hardware: NVIDIA T4 Tensor Core GPU
Compute Provider: Google Cloud Platform (via Kaggle/Colab)
Carbon Emitted: < 0.01 kg CO2eq (Estimated using MLCO2 Impact Tracker)

Technical Specifications

Frameworks: Transformers 5.0.0, PyTorch 2.10.0+cu128, Datasets 4.8.3
Infrastructure: Modularized Python scripts (data.py, train.py) with integrated wandb logging and huggingface_hub syncing.

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 10
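
With `lr_scheduler_type: linear`, the learning rate decays linearly from its initial value to zero over training. A sketch of that schedule, assuming zero warmup steps (the card does not list a warmup setting) and 3,500 total steps as in the results table. `linear_lr` is an illustrative helper, not a library function.

```python
def linear_lr(step: int, base_lr: float = 5e-5, total_steps: int = 3500,
              warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, midpoint, and end of training.
for step in (0, 1750, 3500):
    print(step, linear_lr(step))
```

Under these assumptions the rate is halved by the end of epoch 5 and reaches zero at step 3,500.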

Training results

Training Loss	Epoch	Step	Validation Loss
No log	1.0	350	2.1436
2.4290	2.0	700	2.1895
1.3851	3.0	1050	2.2901
1.3851	4.0	1400	2.9491
0.6660	5.0	1750	3.4322
0.3433	6.0	2100	4.3519
0.3433	7.0	2450	4.5286
0.2059	8.0	2800	4.7578
0.1519	9.0	3150	4.8699
0.1098	10.0	3500	4.8567
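
The overfitting pattern can be read directly off this table. The sketch below copies the logged values and finds the epoch with the lowest validation loss, along with the gap between validation and training loss at the last epoch. The variable names are illustrative.

```python
# (epoch, training_loss or None when not logged, validation_loss), copied from the table
LOG = [
    (1, None, 2.1436), (2, 2.4290, 2.1895), (3, 1.3851, 2.2901),
    (4, 1.3851, 2.9491), (5, 0.6660, 3.4322), (6, 0.3433, 4.3519),
    (7, 0.3433, 4.5286), (8, 0.2059, 4.7578), (9, 0.1519, 4.8699),
    (10, 0.1098, 4.8567),
]

# Epoch with the lowest logged validation loss.
best_epoch, _, best_val = min(LOG, key=lambda row: row[2])

# Generalization gap at the final epoch.
last_epoch, last_train, last_val = LOG[-1]
gap = last_val - last_train

print(best_epoch, best_val)  # 1 2.1436
print(round(gap, 4))         # 4.7469
```

By validation loss alone, epoch 1 is the lowest logged point, slightly below epoch 2.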

Framework versions

Transformers 5.0.0
PyTorch 2.10.0+cu128
Datasets 4.8.3
Tokenizers 0.22.2

Citation

@article{sanh2019distilbert,
  title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},
  author={Sanh, Victor and Debut, Lysandre and Chaumond, Adrien and Wolf, Thomas},
  journal={arXiv preprint arXiv:1910.01108},
  year={2019}
}

Model size: 65.8M params (Safetensors, tensor type F32)
