The goal of the competition is to design a predictive model that accurately classifies movies into their respective genres based on their titles and synopses.
The model takes the movie_name and synopsis fields, joined into a single string, as input and outputs the predicted genre of the movie.
- Developed by: [Shalaka Thorat]
- Shared by: [Data Driven Science - Movie Genre Prediction Contest: competitions/movie-genre-prediction]
- Language: [Python]
- Tags: [Python, NLP, Sklearn, NLTK, Machine Learning, Multi-class Classification, Supervised Learning]
- Repository: [competitions/movie-genre-prediction]
We use the Multinomial Naive Bayes algorithm because it works well with sparse vectorized data, which here is built from movie_name and synopsis. The model outputs a single genre class out of 10 possible classes.
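A minimal sketch of how such a model could be assembled with scikit-learn is shown here. It is not the author's exact code: the toy rows, the genre names, and the default `TfidfVectorizer` and `MultinomialNB` settings are assumptions made for illustration only.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder

# Toy rows invented for illustration; they are not taken from the competition data.
movie_names = ["Space Raiders", "Love in Paris", "Haunted Manor", "Galaxy Wars"]
synopses = [
    "a crew of astronauts fights aliens on a distant planet",
    "two strangers fall in love during a summer in paris",
    "a family moves into a house haunted by a ghost",
    "rebels fight an alien empire across the galaxy",
]
genres = ["scifi", "romance", "horror", "scifi"]

# Join title and synopsis into one string per movie.
texts = [f"{name} {synopsis}" for name, synopsis in zip(movie_names, synopses)]

# Encode genre names as integer class ids.
label_encoder = LabelEncoder()
y = label_encoder.fit_transform(genres)

# TF-IDF yields a sparse matrix, which MultinomialNB accepts directly.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(texts, y)

# Predict a class id for an unseen movie and map it back to a genre name.
predicted_ids = model.predict(["Star Patrol astronauts fight aliens across the galaxy"])
predicted_genres = label_encoder.inverse_transform(predicted_ids)
print(predicted_genres[0])
```

Wrapping the vectorizer and classifier in one pipeline keeps the vocabulary learned on the training text and reuses it unchanged at prediction time.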
All the Training and Test Data can be found here:
- Label Encoding
- TF-IDF Vectorization
- Removal of digits, special characters, symbols, extra spaces, and stop words from the text
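The text cleaning step above can be sketched as follows. This is an illustration under stated assumptions, not the author's code: the function name `clean_text` and the tiny `STOP_WORDS` set are hypothetical (the tags mention NLTK, whose English stop word list may have been used instead), and lowercasing is an added assumption.

```python
import re

# Tiny illustrative stop word set (assumption); a real run would likely use a fuller list.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "and", "to", "is"}


def clean_text(text: str) -> str:
    """Strip digits, symbols, extra spaces, and stop words from a string."""
    text = text.lower()  # lowercasing is an assumption, not stated in the source
    # Replace anything that is not a letter or whitespace with a space.
    text = re.sub(r"[^a-z\s]", " ", text)
    # split() with no arguments also collapses runs of whitespace.
    tokens = text.split()
    return " ".join(token for token in tokens if token not in STOP_WORDS)


print(clean_text("The  Matrix (1999): a hacker learns the truth, in 2 parts!"))
# prints: matrix hacker learns truth parts
```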
The evaluation metric used is [Accuracy] as specified in the competition.
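For reference, accuracy is the fraction of movies whose predicted genre matches the true genre. The helper below is a hypothetical sketch with invented labels, not the competition's scoring code.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    correct = sum(true == pred for true, pred in zip(y_true, y_pred))
    return correct / len(y_true)


# Invented labels for illustration: three of the four predictions are correct.
true_genres = ["action", "horror", "scifi", "action"]
predicted = ["action", "horror", "action", "action"]
print(accuracy(true_genres, predicted))  # prints: 0.75
```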