raccord/scenAIrio-classification

Model Description

The scenAIrio-classification model classifies segments of a movie script (scenario) into one of three categories: NOTES, DIALOGUE, or SEQUENCE. It uses a BERT-based transformer encoder to classify each segment from its textual context.

Intended Use

This model is intended for applications that process and analyze movie scripts or scenarios. It lets scriptwriters, editors, and directors categorize script segments automatically, which simplifies script breakdowns and edits.
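As an illustration of this use, here is a minimal sketch built on the Hugging Face transformers text-classification pipeline. The repository name is the one given in this card; the helper names (load_classifier, classify_segments) and the sample French lines in the comments are hypothetical. Access to the repository may require accepting its conditions and logging in first (for example with huggingface-cli login), and the label strings returned depend on the model's configuration, so they may appear as generic identifiers rather than NOTES, DIALOGUE, or SEQUENCE.

```python
from typing import Callable, Dict, List

MODEL_ID = "raccord/scenAIrio-classification"  # repository name from this card


def load_classifier() -> Callable:
    """Build a text-classification pipeline (downloads the model on first use)."""
    # Imported lazily so classify_segments can be used without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=MODEL_ID)


def classify_segments(classifier: Callable, segments: List[str]) -> List[Dict[str, object]]:
    """Pair each script segment with its predicted label and score."""
    # The pipeline returns one {"label": ..., "score": ...} dict per input string.
    predictions = classifier(segments)
    return [
        {"text": text, "label": pred["label"], "score": pred["score"]}
        for text, pred in zip(segments, predictions)
    ]


# Hypothetical usage (requires network access and permission to read the repository):
#   classifier = load_classifier()
#   rows = classify_segments(classifier, ["INT. CUISINE - JOUR", "Tu es en retard."])
#   for row in rows:
#       print(row["label"], row["score"], row["text"])
```

Keeping the loading step separate from the classification step makes it easy to swap in a locally saved checkpoint or a stub during testing.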

Training Data

The model was trained on a dataset consisting of annotated movie scripts. Each part of the script was labeled as NOTES, DIALOGUE, or SEQUENCE.

Training Procedure

The model was trained using the following training arguments:

  • Output Directory: ./scenAIrio-modal
  • Training: Enabled
  • Evaluation: Enabled
  • Epochs: 3
  • Training Batch Size per Device: 16
  • Evaluation Batch Size per Device: 32
  • Warmup Steps: 100
  • Weight Decay: 0.01
  • Logging: Every 50 steps to ./multi-class-logs
  • Evaluation Strategy: Every 50 steps
  • Save Strategy: Save checkpoints every 50 steps
  • Best Model Loading: At the end of training, the best performing model is loaded
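
The settings above map fairly directly onto transformers.TrainingArguments. The sketch below is one plausible reconstruction, not the authors' actual script: the values come from the list, while the dictionary name, the helper function, and the choice of step-based strategies are assumptions. The evaluation-strategy keyword was renamed between library versions, so the helper checks which name the installed version accepts; other argument names may also differ across versions.

```python
from typing import Any, Dict

# Values copied from the list above; keys follow transformers.TrainingArguments.
TRAINING_KWARGS: Dict[str, Any] = {
    "output_dir": "./scenAIrio-modal",
    "do_train": True,
    "do_eval": True,
    "num_train_epochs": 3,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 32,
    "warmup_steps": 100,
    "weight_decay": 0.01,
    "logging_dir": "./multi-class-logs",
    "logging_steps": 50,
    "eval_steps": 50,
    "save_strategy": "steps",
    "save_steps": 50,
    "load_best_model_at_end": True,
}


def build_training_arguments():
    """Create TrainingArguments from the settings above (hypothetical helper)."""
    import dataclasses

    from transformers import TrainingArguments

    # Newer releases use `eval_strategy`; older ones use `evaluation_strategy`.
    field_names = {f.name for f in dataclasses.fields(TrainingArguments)}
    strategy_key = "eval_strategy" if "eval_strategy" in field_names else "evaluation_strategy"
    return TrainingArguments(**TRAINING_KWARGS, **{strategy_key: "steps"})
```

Loading the best model at the end requires the evaluation and save schedules to line up, which is why both run every 50 steps here.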

Model Architecture

The model is based on a BERT transformer encoder adapted for multi-class classification over the three labels NOTES, DIALOGUE, and SEQUENCE.

Evaluation Results

Split        Loss      Accuracy   F1-Score   Precision   Recall
Validation   0.21253   93.73%     95.37%     95.53%      95.24%
Train        0.08378   97.94%     98.47%     98.56%      98.39%
Test         0.26723   91.59%     93.49%     93.17%      93.84%
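
For readers who want to compute the same kinds of metrics on their own annotated segments, the sketch below derives accuracy, precision, recall, and F1 for the three labels in plain Python. The card does not say how the per-class scores were averaged, so macro-averaging here is an assumption, and the function name and toy labels are hypothetical.

```python
from typing import Dict, Sequence

LABELS = ["NOTES", "DIALOGUE", "SEQUENCE"]


def macro_metrics(y_true: Sequence[str], y_pred: Sequence[str],
                  labels: Sequence[str] = LABELS) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, and F1 (assumed averaging)."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy example with made-up annotations, not data from the model's evaluation.
toy_true = ["DIALOGUE", "DIALOGUE", "NOTES", "SEQUENCE", "SEQUENCE", "NOTES"]
toy_pred = ["DIALOGUE", "NOTES", "NOTES", "SEQUENCE", "SEQUENCE", "NOTES"]
toy_scores = macro_metrics(toy_true, toy_pred)
```

With a different averaging scheme (micro or weighted) the precision, recall, and F1 values would differ, so results computed this way are comparable to the table only if the averaging matches.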

Limitations

  • The model is specifically trained on French-language scripts and may not perform well with scripts in other languages.
  • Performance can vary significantly depending on the specific characteristics and formatting of the input scripts.

Conclusion

The scenAIrio-classification model categorizes parts of movie scripts as NOTES, DIALOGUE, or SEQUENCE. On the test set it reaches 91.59% accuracy and a 93.49% F1-score, which makes it a practical aid for script breakdown work in film and television, within the limitations noted above.

Model size: 125M params (Safetensors, tensor type F32)
