shalaka-thorat committed on
Commit
f19a40c
·
1 Parent(s): 3b4cadf

Create README.md

Files changed (1)
  1. README.md +53 -0
README.md ADDED
@@ -0,0 +1,53 @@
+ ---
+ language:
+ - en
+ metrics:
+ - accuracy
+ tags:
+ - sklearn
+ - machine learning
+ - movie-genre-prediction
+ - multi-class classification
+ ---
+
+ ## Model Details
+
+ ### Model Description
+
+ The goal of the competition is to design a predictive model that accurately classifies movies into their respective genres based on their titles and synopses.
+
+ The model takes movie_name and synopsis, combined into a single string, as input and outputs the predicted genre of the movie.
+
+
+
+ - **Developed by:** [Shalaka Thorat]
+ - **Shared by:** [Data Driven Science - Movie Genre Prediction Contest: competitions/movie-genre-prediction]
+ - **Language:** [Python]
+ - **Tags:** [Python, NLP, Sklearn, NLTK, Machine Learning, Multi-class Classification, Supervised Learning]
+
+ ### Model Sources
+
+ - **Repository:** [competitions/movie-genre-prediction]
+
+ ## Training Details
+
+ We use the Multinomial Naive Bayes algorithm, which works well with sparse vectorized data; the features are built from movie_name and synopsis.
+ The output of the model is one of 10 genre classes.
+
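As a rough illustration of the approach described above, the following minimal sketch fits a TF-IDF plus Multinomial Naive Bayes pipeline with scikit-learn. The toy texts, the two genre names, and the default hyperparameters are assumptions made for the example; they are not taken from the competition data or from the author's actual training code.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy rows: movie_name and synopsis joined into one string.
train_texts = [
    "Space Raiders A crew fights aliens on a distant planet",
    "Galaxy War Starships battle aliens across the planet system",
    "Love in Paris Two strangers fall in love over one summer",
    "Summer Hearts A couple find love and romance in summer",
]
# Hypothetical genre labels (the real task has 10 classes).
train_genres = ["scifi", "scifi", "romance", "romance"]

# TfidfVectorizer yields a sparse matrix, which MultinomialNB accepts directly.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train_texts, train_genres)

# Predict the genre of an unseen title + synopsis string.
predicted = model.predict(["Alien Planet Aliens invade a planet"])
print(predicted[0])
```

Wrapping both steps in one pipeline keeps the vocabulary learned at training time attached to the classifier, so the same transformation is applied at prediction time.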
+ ### Training Data
+
+ All the training and test data can be found here:
+
+ [competitions/movie-genre-prediction]
+
+ #### Preprocessing
+
+ 1) Label Encoding
+ 2) Tokenization
+ 3) TF-IDF Vectorization
+ 4) Removal of digits, special characters, symbols, extra spaces and stop words from the text
+
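The text-cleaning and label-encoding steps listed above could look roughly like the sketch below. It uses only the standard library; the small stop word set, the function names `clean_text` and `encode_labels`, and the sample inputs are hypothetical. The README's tags mention NLTK and Sklearn, so the original code presumably relied on those libraries instead.

```python
import re

# Assumption: a tiny hand-written stop word list for illustration only.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "and", "to", "is"}

def clean_text(text: str) -> str:
    """Lowercase, strip digits/symbols, collapse spaces, drop stop words."""
    text = text.lower()
    # Replace anything that is not an ASCII letter or whitespace with a space.
    text = re.sub(r"[^a-z\s]", " ", text)
    # Whitespace tokenization; split() also collapses runs of spaces.
    tokens = text.split()
    return " ".join(t for t in tokens if t not in STOP_WORDS)

def encode_labels(labels):
    """Map each distinct label to an integer id, in sorted label order."""
    classes = sorted(set(labels))
    to_id = {name: i for i, name in enumerate(classes)}
    return [to_id[name] for name in labels], classes

cleaned = clean_text("The  Matrix (1999): A hacker learns the truth!")
ids, classes = encode_labels(["horror", "action", "horror"])
print(cleaned)
print(ids, classes)
```

The regular expression keeps only ASCII letters, which matches the English-only setting declared in the metadata but would discard accented characters in other languages.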
+ ## Evaluation
+
+ The evaluation metric is [Accuracy], as specified by the competition.
+
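For reference, accuracy is the fraction of predictions that match the true label. The sketch below computes it in plain Python; the function name `accuracy` and the sample labels are made up for illustration and do not reproduce any competition result.

```python
def accuracy(y_true, y_pred):
    """Return the share of positions where prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical labels: 3 of the 4 predictions are correct.
score = accuracy(
    ["action", "horror", "drama", "action"],
    ["action", "drama", "drama", "action"],
)
print(score)
```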