shalaka-thorat/movie-genre-prediction-contest

Model Details

Model Description

The goal of the competition is to design a predictive model that accurately classifies movies into their respective genres based on their titles and synopses.

The model takes the movie_name and synopsis fields, combined into a single string, and outputs the predicted genre of the movie.
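As a minimal sketch of how the two fields might be combined into one input string: the helper name `build_input` and the single-space separator are assumptions for illustration, not details confirmed by the model card.

```python
def build_input(movie_name: str, synopsis: str) -> str:
    """Join the title and synopsis into one text field.

    Hypothetical helper: the separator (a single space) is an assumption.
    """
    return f"{movie_name.strip()} {synopsis.strip()}"


# Invented example row, not taken from the competition data.
combined = build_input("Heat ", " A crew of robbers plans one last job.")
print(combined)
```

Keeping the title in the same string as the synopsis lets a single vectorizer see the words from both fields.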

Developed by: [Shalaka Thorat]
Shared by: [Data Driven Science - Movie Genre Prediction Contest: competitions/movie-genre-prediction]
Language: [Python]
Tags: [Python, NLP, Sklearn, NLTK, Machine Learning, Multi-class Classification, Supervised Learning]

Model Sources

Repository: [competitions/movie-genre-prediction]

Training Details

The model uses the Multinomial Naive Bayes algorithm, which works well with sparse vectorized data. Here the features are built from movie_name and synopsis. The output of the model is a single genre class out of 10 possible classes.
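The following is a rough sketch of a TF-IDF plus Multinomial Naive Bayes classifier in scikit-learn. It is not the author's exact training code: the toy texts, the three genre labels, and the default hyperparameters are all assumptions chosen to keep the example small.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy rows invented for illustration (title + synopsis as one string).
train_texts = [
    "Space Raiders A crew battles aliens on a distant planet",
    "Laugh Out Loud Two friends stumble through a chaotic road trip",
    "Dark Corridor A family is haunted by a ghost in their new house",
    "Star Patrol Soldiers defend a space station from alien invaders",
]
train_genres = ["scifi", "comedy", "horror", "scifi"]

# TfidfVectorizer produces a sparse matrix, which MultinomialNB accepts directly.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train_texts, train_genres)

# Predict the genre of an unseen (also invented) example.
predicted = model.predict(["Alien Dawn Explorers find aliens on a planet"])
print(predicted[0])
```

Wrapping both steps in one pipeline means the vectorizer fitted on the training text is reused unchanged at prediction time.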

Training Data

The training and test data are available here:

[competitions/movie-genre-prediction]

Preprocessing

Label Encoding
Tokenization
TF-IDF Vectorization
Text cleaning of digits, special characters, symbols, extra spaces, and stop words
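The preprocessing steps listed above could be sketched roughly as follows. Several details are assumptions rather than facts from the model card: the cleaning is taken to mean removal, lowercasing is added, tokenization is a plain whitespace split, and a tiny hand-written stop-word set stands in for the NLTK stop-word list so the example runs without downloads.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import LabelEncoder

# Tiny illustrative stop-word set (assumption; NLTK's list is much larger).
STOP_WORDS = {"a", "an", "the", "of", "in", "and", "is", "to"}


def clean_text(text: str) -> str:
    """Lowercase, drop non-letters, tokenize on whitespace, drop stop words."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # digits, symbols, special characters
    tokens = text.split()                  # also collapses extra spaces
    return " ".join(t for t in tokens if t not in STOP_WORDS)


cleaned = clean_text("The  Matrix 2: Return of   the Code!")

# Label encoding: map genre names to integer class ids (sorted alphabetically).
encoder = LabelEncoder()
y = encoder.fit_transform(["horror", "comedy", "horror"])

# TF-IDF vectorization of the cleaned text into a sparse matrix.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform([cleaned, "comedy night"])
```

`encoder.inverse_transform` can map predicted class ids back to genre names when writing out predictions.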

Evaluation

The evaluation metric used is [Accuracy] as specified in the competition.
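Accuracy is the fraction of predictions that match the true label. A small sketch using scikit-learn's `accuracy_score`, with label lists invented for illustration:

```python
from sklearn.metrics import accuracy_score

# Invented labels for illustration only; not competition results.
y_true = ["horror", "comedy", "scifi", "horror"]
y_pred = ["horror", "comedy", "horror", "horror"]

# 3 of the 4 predictions match the true genre.
accuracy = accuracy_score(y_true, y_pred)
print(accuracy)
```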