Model Details

Model Description

The goal of the competition is to design a predictive model that accurately classifies movies into their respective genres based on their titles and synopses.

The model takes the movie_name and synopsis fields, combined into a single string, as input and outputs the predicted genre of the movie.

  • Developed by: [Shalaka Thorat]
  • Shared by: [Data Driven Science - Movie Genre Prediction Contest: competitions/movie-genre-prediction]
  • Language: [Python]
  • Tags: [Python, NLP, Sklearn, NLTK, Machine Learning, Multi-class Classification, Supervised Learning]

Model Sources

  • Repository: [competitions/movie-genre-prediction]

Training Details

The model uses the Multinomial Naive Bayes algorithm, which works well with sparse vectorized data. Here, the vectorized input is built from movie_name and synopsis. The output of the model is a single genre class, chosen from 10 classes.
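
As a rough illustration of this setup, the sketch below trains a TF-IDF plus Multinomial Naive Bayes pipeline with scikit-learn. The toy rows, the two genre names, and the variable names are invented for the example and are not taken from the competition data. Joining movie_name and synopsis with a space is an assumption about how the single input string is built.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder

# Toy rows invented for illustration; the real data has 10 genres.
movie_names = ["Space Raiders", "Laugh Riot", "Star Siege", "Funny Business"]
synopses = [
    "a crew fights aliens on a distant planet",
    "two clumsy friends open a comedy club",
    "rebels defend a space station from invaders",
    "an office prank war gets out of hand with hilarious results",
]
genres = ["scifi", "comedy", "scifi", "comedy"]

# Assumption: title and synopsis are joined into one string per movie.
texts = [f"{name} {synopsis}" for name, synopsis in zip(movie_names, synopses)]

# Label encoding: map genre names to integer class ids.
encoder = LabelEncoder()
y = encoder.fit_transform(genres)

# TF-IDF produces a sparse, non-negative matrix that MultinomialNB accepts.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(texts, y)

# Predict a class id for an unseen movie, then decode it to a genre name.
new_text = "Alien Planet aliens attack a space crew"
predicted = encoder.inverse_transform(model.predict([new_text]))
print(predicted[0])
```

Wrapping the vectorizer and classifier in one pipeline keeps the TF-IDF vocabulary fitted on training text only, so the same transformation is reused at prediction time.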

Training Data

All the Training and Test Data can be found here:



  1. Label Encoding
  2. Tokenization
  3. TF-IDF Vectorization
  4. Removal of digits, special characters, symbols, extra spaces and stop words from the textual data
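
The text-cleaning and tokenization steps in the list above could look roughly like the following. This is a minimal sketch using only the standard library: the function name clean_text, the lowercasing step, and the tiny stop-word set are assumptions made for illustration. The tags mention NLTK, whose English stop-word list is much longer than the one shown here.

```python
import re

# Small illustrative stop-word set (an assumption, not the list used in training).
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "and", "to", "is"}

def clean_text(text):
    """Lowercase, strip non-letters, tokenize on whitespace, drop stop words."""
    text = text.lower()
    # Replace digits, special characters and symbols with spaces.
    text = re.sub(r"[^a-z\s]", " ", text)
    # Splitting on whitespace tokenizes and collapses extra spaces in one step.
    tokens = text.split()
    return [token for token in tokens if token not in STOP_WORDS]

tokens = clean_text("The  Matrix (1999): a hacker learns the truth!")
print(tokens)
```

The cleaned tokens can be joined back with spaces before being passed to a TF-IDF vectorizer.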


The evaluation metric is [Accuracy], as specified by the competition.
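
For reference, accuracy is the fraction of predictions that match the true labels. The sketch below computes it by hand; the function name and the example labels are made up for illustration and do not reflect the model's actual results.

```python
def accuracy(y_true, y_pred):
    """Return the share of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical labels: three of the four predictions are correct.
true_genres = ["action", "comedy", "horror", "drama"]
predicted_genres = ["action", "comedy", "drama", "drama"]
score = accuracy(true_genres, predicted_genres)
print(score)
```

With 10 classes, accuracy treats every genre equally per sample, so it can look high when a few genres dominate the data.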
