{ "cells": [ { "cell_type": "markdown", "id": "c5802a21", "metadata": {}, "source": [ "# Project: Movie Recommendation System\n", "\n", "Recommender systems are an important class of machine learning algorithms that offer users \"relevant\" suggestions. YouTube, Amazon, and Netflix all use systems that suggest videos or products based on the attributes of items you have engaged with before (content-based filtering) or on the behavior and preferences of other users who share your interests (collaborative filtering).\n", "\n", "Recommendation systems work from the similarity between either the items themselves or the users who access them. There are several ways to measure the similarity between two items. The system uses the resulting similarity matrix to recommend the items most similar to one the user already likes.\n", "\n", "In this project, we will build a machine learning model that recommends movies based on a movie the user likes. The model is based on cosine similarity.\n" ] }, { "cell_type": "markdown", "id": "45208056", "metadata": {}, "source": [ "## Importing dependencies" ] }, { "cell_type": "code", "execution_count": 1, "id": "fb34fc04", "metadata": {}, "outputs": [], "source": [ "import os\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns" ] }, { "cell_type": "markdown", "id": "60a47b8f", "metadata": {}, "source": [ "## Loading the Data" ] }, { "cell_type": "code", "execution_count": 2, "id": "22801041", "metadata": {}, "outputs": [], "source": [ "movies = pd.read_csv('tmdb_5000_movies.csv')\n", "credits = pd.read_csv('tmdb_5000_credits.csv')" ] }, { "cell_type": "code", "execution_count": 3, "id": "365e7f1f", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " budget genres \\\n", "0 237000000 [{\"id\": 28, \"name\": \"Action\"}, {\"id\": 12, \"nam... \n", "1 300000000 [{\"id\": 12, \"name\": \"Adventure\"}, {\"id\": 14, \"... \n", "2 245000000 [{\"id\": 28, \"name\": \"Action\"}, {\"id\": 12, \"nam... \n", "3 250000000 [{\"id\": 28, \"name\": \"Action\"}, {\"id\": 80, \"nam... \n", "4 260000000 [{\"id\": 28, \"name\": \"Action\"}, {\"id\": 12, \"nam... \n", "\n", " homepage id \\\n", "0 http://www.avatarmovie.com/ 19995 \n", "1 http://disney.go.com/disneypictures/pirates/ 285 \n", "2 http://www.sonypictures.com/movies/spectre/ 206647 \n", "3 http://www.thedarkknightrises.com/ 49026 \n", "4 http://movies.disney.com/john-carter 49529 \n", "\n", " keywords original_language \\\n", "0 [{\"id\": 1463, \"name\": \"culture clash\"}, {\"id\":... en \n", "1 [{\"id\": 270, \"name\": \"ocean\"}, {\"id\": 726, \"na... en \n", "2 [{\"id\": 470, \"name\": \"spy\"}, {\"id\": 818, \"name... en \n", "3 [{\"id\": 849, \"name\": \"dc comics\"}, {\"id\": 853,... en \n", "4 [{\"id\": 818, \"name\": \"based on novel\"}, {\"id\":... en \n", "\n", " original_title \\\n", "0 Avatar \n", "1 Pirates of the Caribbean: At World's End \n", "2 Spectre \n", "3 The Dark Knight Rises \n", "4 John Carter \n", "\n", " overview popularity \\\n", "0 In the 22nd century, a paraplegic Marine is di... 150.437577 \n", "1 Captain Barbossa, long believed to be dead, ha... 139.082615 \n", "2 A cryptic message from Bond’s past sends him o... 107.376788 \n", "3 Following the death of District Attorney Harve... 112.312950 \n", "4 John Carter is a war-weary, former military ca... 43.926995 \n", "\n", " production_companies \\\n", "0 [{\"name\": \"Ingenious Film Partners\", \"id\": 289... \n", "1 [{\"name\": \"Walt Disney Pictures\", \"id\": 2}, {\"... \n", "2 [{\"name\": \"Columbia Pictures\", \"id\": 5}, {\"nam... \n", "3 [{\"name\": \"Legendary Pictures\", \"id\": 923}, {\"... 
\n", "4 [{\"name\": \"Walt Disney Pictures\", \"id\": 2}] \n", "\n", " production_countries release_date revenue \\\n", "0 [{\"iso_3166_1\": \"US\", \"name\": \"United States o... 2009-12-10 2787965087 \n", "1 [{\"iso_3166_1\": \"US\", \"name\": \"United States o... 2007-05-19 961000000 \n", "2 [{\"iso_3166_1\": \"GB\", \"name\": \"United Kingdom\"... 2015-10-26 880674609 \n", "3 [{\"iso_3166_1\": \"US\", \"name\": \"United States o... 2012-07-16 1084939099 \n", "4 [{\"iso_3166_1\": \"US\", \"name\": \"United States o... 2012-03-07 284139100 \n", "\n", " runtime spoken_languages status \\\n", "0 162.0 [{\"iso_639_1\": \"en\", \"name\": \"English\"}, {\"iso... Released \n", "1 169.0 [{\"iso_639_1\": \"en\", \"name\": \"English\"}] Released \n", "2 148.0 [{\"iso_639_1\": \"fr\", \"name\": \"Fran\\u00e7ais\"},... Released \n", "3 165.0 [{\"iso_639_1\": \"en\", \"name\": \"English\"}] Released \n", "4 132.0 [{\"iso_639_1\": \"en\", \"name\": \"English\"}] Released \n", "\n", " tagline \\\n", "0 Enter the World of Pandora. \n", "1 At the end of the world, the adventure begins. \n", "2 A Plan No One Escapes \n", "3 The Legend Ends \n", "4 Lost in our world, found in another. \n", "\n", " title vote_average vote_count \n", "0 Avatar 7.2 11800 \n", "1 Pirates of the Caribbean: At World's End 6.9 4500 \n", "2 Spectre 6.3 4466 \n", "3 The Dark Knight Rises 7.6 9106 \n", "4 John Carter 6.1 2124 " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies.head(5)" ] }, { "cell_type": "code", "execution_count": 4, "id": "f161a85b", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " movie_id title \\\n", "0 19995 Avatar \n", "1 285 Pirates of the Caribbean: At World's End \n", "2 206647 Spectre \n", "3 49026 The Dark Knight Rises \n", "4 49529 John Carter \n", "\n", " cast \\\n", "0 [{\"cast_id\": 242, \"character\": \"Jake Sully\", \"... \n", "1 [{\"cast_id\": 4, \"character\": \"Captain Jack Spa... \n", "2 [{\"cast_id\": 1, \"character\": \"James Bond\", \"cr... \n", "3 [{\"cast_id\": 2, \"character\": \"Bruce Wayne / Ba... \n", "4 [{\"cast_id\": 5, \"character\": \"John Carter\", \"c... \n", "\n", " crew \n", "0 [{\"credit_id\": \"52fe48009251416c750aca23\", \"de... \n", "1 [{\"credit_id\": \"52fe4232c3a36847f800b579\", \"de... \n", "2 [{\"credit_id\": \"54805967c3a36829b5002c41\", \"de... \n", "3 [{\"credit_id\": \"52fe4781c3a36847f81398c3\", \"de... \n", "4 [{\"credit_id\": \"52fe479ac3a36847f813eaa3\", \"de... " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "credits.head(5)" ] }, { "cell_type": "code", "execution_count": 5, "id": "850bed94", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 4803 entries, 0 to 4802\n", "Data columns (total 20 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 budget 4803 non-null int64 \n", " 1 genres 4803 non-null object \n", " 2 homepage 1712 non-null object \n", " 3 id 4803 non-null int64 \n", " 4 keywords 4803 non-null object \n", " 5 original_language 4803 non-null object \n", " 6 original_title 4803 non-null object \n", " 7 overview 4800 non-null object \n", " 8 popularity 4803 non-null float64\n", " 9 production_companies 4803 non-null object \n", " 10 production_countries 4803 non-null object \n", " 11 release_date 4802 non-null object \n", " 12 revenue 4803 non-null int64 \n", " 13 runtime 4801 non-null float64\n", " 14 spoken_languages 4803 non-null object \n", " 15 status 4803 non-null object \n", " 16 tagline 3959 non-null object 
\n", " 17 title 4803 non-null object \n", " 18 vote_average 4803 non-null float64\n", " 19 vote_count 4803 non-null int64 \n", "dtypes: float64(3), int64(4), object(13)\n", "memory usage: 750.6+ KB\n" ] } ], "source": [ "movies.info()" ] }, { "cell_type": "code", "execution_count": 6, "id": "a6859a8a", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 4803 entries, 0 to 4802\n", "Data columns (total 4 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 movie_id 4803 non-null int64 \n", " 1 title 4803 non-null object\n", " 2 cast 4803 non-null object\n", " 3 crew 4803 non-null object\n", "dtypes: int64(1), object(3)\n", "memory usage: 150.2+ KB\n" ] } ], "source": [ "credits.info()" ] }, { "cell_type": "markdown", "id": "1f1a4038", "metadata": {}, "source": [ "## Merging both dataframes: Movies & Credits" ] }, { "cell_type": "code", "execution_count": 7, "id": "26168071", "metadata": {}, "outputs": [], "source": [ "# Note: merging on 'title' returns 4809 rows from two 4803-row frames because some titles occur more than once\n", "movies = movies.merge(credits,on='title')" ] }, { "cell_type": "code", "execution_count": 8, "id": "28d61e2b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(4809, 23)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies.shape" ] }, { "cell_type": "markdown", "id": "958ed596", "metadata": {}, "source": [ "## Data Pre-Processing" ] }, { "cell_type": "markdown", "id": "6cca2e2b", "metadata": {}, "source": [ "## Important columns to be used in the recommendation system:\n", "\n", "* genres\n", "* movie_id\n", "* keywords\n", "* title\n", "* overview\n", "* cast\n", "* crew\n", "\n", "We keep only these columns and build the features for the recommender from them."
] }, { "cell_type": "code", "execution_count": 9, "id": "0cb8d18f", "metadata": {}, "outputs": [], "source": [ "movies = movies[['movie_id','title','overview','genres','cast','keywords','crew']]" ] }, { "cell_type": "code", "execution_count": 10, "id": "7cbdf69e", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " movie_id title \\\n", "0 19995 Avatar \n", "1 285 Pirates of the Caribbean: At World's End \n", "2 206647 Spectre \n", "3 49026 The Dark Knight Rises \n", "4 49529 John Carter \n", "\n", " overview \\\n", "0 In the 22nd century, a paraplegic Marine is di... \n", "1 Captain Barbossa, long believed to be dead, ha... \n", "2 A cryptic message from Bond’s past sends him o... \n", "3 Following the death of District Attorney Harve... \n", "4 John Carter is a war-weary, former military ca... \n", "\n", " genres \\\n", "0 [{\"id\": 28, \"name\": \"Action\"}, {\"id\": 12, \"nam... \n", "1 [{\"id\": 12, \"name\": \"Adventure\"}, {\"id\": 14, \"... \n", "2 [{\"id\": 28, \"name\": \"Action\"}, {\"id\": 12, \"nam... \n", "3 [{\"id\": 28, \"name\": \"Action\"}, {\"id\": 80, \"nam... \n", "4 [{\"id\": 28, \"name\": \"Action\"}, {\"id\": 12, \"nam... \n", "\n", " cast \\\n", "0 [{\"cast_id\": 242, \"character\": \"Jake Sully\", \"... \n", "1 [{\"cast_id\": 4, \"character\": \"Captain Jack Spa... \n", "2 [{\"cast_id\": 1, \"character\": \"James Bond\", \"cr... \n", "3 [{\"cast_id\": 2, \"character\": \"Bruce Wayne / Ba... \n", "4 [{\"cast_id\": 5, \"character\": \"John Carter\", \"c... \n", "\n", " keywords \\\n", "0 [{\"id\": 1463, \"name\": \"culture clash\"}, {\"id\":... \n", "1 [{\"id\": 270, \"name\": \"ocean\"}, {\"id\": 726, \"na... \n", "2 [{\"id\": 470, \"name\": \"spy\"}, {\"id\": 818, \"name... \n", "3 [{\"id\": 849, \"name\": \"dc comics\"}, {\"id\": 853,... \n", "4 [{\"id\": 818, \"name\": \"based on novel\"}, {\"id\":... \n", "\n", " crew \n", "0 [{\"credit_id\": \"52fe48009251416c750aca23\", \"de... \n", "1 [{\"credit_id\": \"52fe4232c3a36847f800b579\", \"de... \n", "2 [{\"credit_id\": \"54805967c3a36829b5002c41\", \"de... \n", "3 [{\"credit_id\": \"52fe4781c3a36847f81398c3\", \"de... \n", "4 [{\"credit_id\": \"52fe479ac3a36847f813eaa3\", \"de... 
" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies.head(5)" ] }, { "cell_type": "markdown", "id": "47a979b2", "metadata": {}, "source": [ "## Missing Values" ] }, { "cell_type": "code", "execution_count": 11, "id": "7d57cfb0", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "movie_id 0\n", "title 0\n", "overview 3\n", "genres 0\n", "cast 0\n", "keywords 0\n", "crew 0\n", "dtype: int64" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Checking for Missing Values\n", "movies.isnull().sum()\n", " " ] }, { "cell_type": "code", "execution_count": 12, "id": "e43e3fc0", "metadata": {}, "outputs": [], "source": [ "#Dropping the missing values\n", "movies.dropna(inplace=True)" ] }, { "cell_type": "code", "execution_count": 13, "id": "4165f068", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "movie_id 0\n", "title 0\n", "overview 0\n", "genres 0\n", "cast 0\n", "keywords 0\n", "crew 0\n", "dtype: int64" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Checking again after dropping the missing values\n", "movies.isnull().sum()" ] }, { "cell_type": "code", "execution_count": 14, "id": "7043f76e", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Checkinf for any duplication in data\n", "movies.duplicated().sum()" ] }, { "cell_type": "code", "execution_count": 15, "id": "a7bd44cd", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'[{\"id\": 28, \"name\": \"Action\"}, {\"id\": 12, \"name\": \"Adventure\"}, {\"id\": 14, \"name\": \"Fantasy\"}, {\"id\": 878, \"name\": \"Science Fiction\"}]'" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#checking genres randomly using index position as 0\n", "movies.iloc[0].genres" ] }, { "cell_type": "code", "execution_count": 
16, "id": "e69162a1", "metadata": {}, "outputs": [], "source": [ "#The ast module provides ast.literal_eval, which safely parses a string containing a Python literal\n", "#(here, a list of dicts) into the corresponding Python object without executing arbitrary code.\n", "import ast" ] }, { "cell_type": "markdown", "id": "eaa1ec3f", "metadata": {}, "source": [ "`ast.literal_eval` only evaluates Python literals (strings, numbers, tuples, lists, dicts, sets, booleans, and `None`). It raises an exception if the input is anything else, so arbitrary code in the input is never executed." ] }, { "cell_type": "markdown", "id": "7a22e725", "metadata": {}, "source": [ "### Function for extracting values from raw data for the creation of tags" ] }, { "cell_type": "code", "execution_count": 17, "id": "2a5e99a2", "metadata": {}, "outputs": [], "source": [ "#Extracting genres and keywords from the raw data for the creation of tags\n", "#Creating a function convert that returns the 'name' of every entry\n", "\n", "def convert(obj):\n", "    L = []\n", "    for i in ast.literal_eval(obj):\n", "        L.append(i['name'])\n", "    return L" ] }, { "cell_type": "markdown", "id": "57e31dc7", "metadata": {}, "source": [ "### Extracting Genres" ] }, { "cell_type": "code", "execution_count": 18, "id": "94540349", "metadata": {}, "outputs": [], "source": [ "#Applying the convert function to the genres column to extract the required data\n", "movies['genres'] = movies['genres'].apply(convert)" ] }, { "cell_type": "markdown", "id": "ba5a12e5", "metadata": {}, "source": [ "### Extracting Keywords" ] }, { "cell_type": "code", "execution_count": 19, "id": "3f486c75", "metadata": {}, "outputs": [], "source": [ "#Applying the convert function to the keywords column to extract the required data\n", "movies['keywords'] = movies['keywords'].apply(convert)" ] }, { "cell_type": "markdown", "id": "ad28db1c", "metadata": {}, "source": [ "### Function for extracting top 3 actors from the movie" ] }, { 
"cell_type": "code", "execution_count": 20, "id": "2e4317e4", "metadata": {}, "outputs": [], "source": [ "# Creating a function for extracting the top 3 actors from the movie 
\n", " \n", "def convert3(obj):\n", "    L = []\n", "    counter = 0\n", "    for i in ast.literal_eval(obj):\n", "        if counter != 3:\n", "            L.append(i['name'])\n", "            counter += 1\n", "        else:\n", "            break\n", "    return L" ] }, { "cell_type": "code", "execution_count": 21, "id": "69bfef22", "metadata": {}, "outputs": [], "source": [ "#Applying the convert3 function to the cast column to extract the required data\n", "movies['cast'] = movies['cast'].apply(convert3)" ] }, { "cell_type": "markdown", "id": "31059f45", "metadata": {}, "source": [ "### Function to fetch the director of the movie from the crew column" ] }, { "cell_type": "code", "execution_count": 22, "id": "c2bebb24", "metadata": {}, "outputs": [], "source": [ "#Creating a function to fetch the director of the movie from the crew column\n", "def fetch_director(obj):\n", "    L = []\n", "    for i in ast.literal_eval(obj):\n", "        if i['job'] == 'Director':\n", "            L.append(i['name'])\n", "            break\n", "    return L" ] }, { "cell_type": "code", "execution_count": 23, "id": "1f092c8a", "metadata": {}, "outputs": [], "source": [ "# Applying the fetch_director function to the crew column to extract the required data\n", "movies['crew'] = movies['crew'].apply(fetch_director)\n", " " ] }, { "cell_type": "code", "execution_count": 24, "id": "fd484435", "metadata": {}, "outputs": [], "source": [ "#Splitting each overview string into a list of words\n", "movies['overview'] = movies['overview'].apply(lambda x:x.split())" ] }, { "cell_type": "markdown", "id": "c13f649f", "metadata": {}, "source": [ "The goal here is to turn a whitespace-separated string such as `case_1 case_2 case_3` into a list, `[case_1, case_2, case_3]`, so that the overview can be concatenated with the other list-valued columns. `.apply(lambda x: x.split())` does this for every row." 
] }, { "cell_type": "markdown", "id": "20cc620f", "metadata": {}, "source": [ "### Checking the Data after extracting all the required values" ] }, { "cell_type": "code", "execution_count": 25, "id": "422e70e8", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " movie_id title \\\n", "0 19995 Avatar \n", "1 285 Pirates of the Caribbean: At World's End \n", "2 206647 Spectre \n", "3 49026 The Dark Knight Rises \n", "4 49529 John Carter \n", "\n", " overview \\\n", "0 [In, the, 22nd, century,, a, paraplegic, Marin... \n", "1 [Captain, Barbossa,, long, believed, to, be, d... \n", "2 [A, cryptic, message, from, Bond’s, past, send... \n", "3 [Following, the, death, of, District, Attorney... \n", "4 [John, Carter, is, a, war-weary,, former, mili... \n", "\n", " genres \\\n", "0 [Action, Adventure, Fantasy, Science Fiction] \n", "1 [Adventure, Fantasy, Action] \n", "2 [Action, Adventure, Crime] \n", "3 [Action, Crime, Drama, Thriller] \n", "4 [Action, Adventure, Science Fiction] \n", "\n", " cast \\\n", "0 [Sam Worthington, Zoe Saldana, Sigourney Weaver] \n", "1 [Johnny Depp, Orlando Bloom, Keira Knightley] \n", "2 [Daniel Craig, Christoph Waltz, Léa Seydoux] \n", "3 [Christian Bale, Michael Caine, Gary Oldman] \n", "4 [Taylor Kitsch, Lynn Collins, Samantha Morton] \n", "\n", " keywords crew \n", "0 [culture clash, future, space war, space colon... [James Cameron] \n", "1 [ocean, drug abuse, exotic island, east india ... [Gore Verbinski] \n", "2 [spy, based on novel, secret agent, sequel, mi... [Sam Mendes] \n", "3 [dc comics, crime fighter, terrorist, secret i... [Christopher Nolan] \n", "4 [based on novel, mars, medallion, space travel... 
[Andrew Stanton] " ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Checking the data after extracting all the required values\n", "movies.head(5)" ] }, { "cell_type": "code", "execution_count": 26, "id": "5f0d2e91", "metadata": {}, "outputs": [], "source": [ "#Removing the spaces inside multi-word values so that each name or phrase becomes a single token\n", "\n", "movies['genres'] = movies['genres'].apply(lambda x:[i.replace(\" \",\"\") for i in x])\n", "movies['keywords'] = movies['keywords'].apply(lambda x:[i.replace(\" \",\"\") for i in x])\n", "movies['cast'] = movies['cast'].apply(lambda x:[i.replace(\" \",\"\") for i in x])\n", "movies['crew'] = movies['crew'].apply(lambda x:[i.replace(\" \",\"\") for i in x])" ] }, { "cell_type": "code", "execution_count": 27, "id": "50908dd1", "metadata": {}, "outputs": [], "source": [ "# Concatenating all the feature lists into a single tags column, which the recommendation system is built on\n", "movies['tags'] = movies['overview'] + movies['genres'] + movies['keywords'] + movies['cast'] + movies['crew']" ] }, { "cell_type": "code", "execution_count": 28, "id": "eacb1e82", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " movie_id title \\\n", "0 19995 Avatar \n", "1 285 Pirates of the Caribbean: At World's End \n", "2 206647 Spectre \n", "3 49026 The Dark Knight Rises \n", "4 49529 John Carter \n", "\n", " overview \\\n", "0 [In, the, 22nd, century,, a, paraplegic, Marin... \n", "1 [Captain, Barbossa,, long, believed, to, be, d... \n", "2 [A, cryptic, message, from, Bond’s, past, send... \n", "3 [Following, the, death, of, District, Attorney... \n", "4 [John, Carter, is, a, war-weary,, former, mili... \n", "\n", " genres \\\n", "0 [Action, Adventure, Fantasy, ScienceFiction] \n", "1 [Adventure, Fantasy, Action] \n", "2 [Action, Adventure, Crime] \n", "3 [Action, Crime, Drama, Thriller] \n", "4 [Action, Adventure, ScienceFiction] \n", "\n", " cast \\\n", "0 [SamWorthington, ZoeSaldana, SigourneyWeaver] \n", "1 [JohnnyDepp, OrlandoBloom, KeiraKnightley] \n", "2 [DanielCraig, ChristophWaltz, LéaSeydoux] \n", "3 [ChristianBale, MichaelCaine, GaryOldman] \n", "4 [TaylorKitsch, LynnCollins, SamanthaMorton] \n", "\n", " keywords crew \\\n", "0 [cultureclash, future, spacewar, spacecolony, ... [JamesCameron] \n", "1 [ocean, drugabuse, exoticisland, eastindiatrad... [GoreVerbinski] \n", "2 [spy, basedonnovel, secretagent, sequel, mi6, ... [SamMendes] \n", "3 [dccomics, crimefighter, terrorist, secretiden... [ChristopherNolan] \n", "4 [basedonnovel, mars, medallion, spacetravel, p... [AndrewStanton] \n", "\n", " tags \n", "0 [In, the, 22nd, century,, a, paraplegic, Marin... \n", "1 [Captain, Barbossa,, long, believed, to, be, d... \n", "2 [A, cryptic, message, from, Bond’s, past, send... \n", "3 [Following, the, death, of, District, Attorney... \n", "4 [John, Carter, is, a, war-weary,, former, mili... 
" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies.head()" ] }, { "cell_type": "code", "execution_count": 29, "id": "7fd341dc", "metadata": {}, "outputs": [], "source": [ "#Craeting a new dataframe with 3 columns \n", "new_df = movies[['movie_id','title','tags']]" ] }, { "cell_type": "code", "execution_count": 30, "id": "bf6743a7", "metadata": {}, "outputs": [], "source": [ "# Supressing the warning messages\n", "import warnings\n", "warnings.filterwarnings('ignore')\n", "\n", "#Joining all the data togther\n", "new_df['tags'] = new_df['tags'].apply(lambda x:\" \".join(x))" ] }, { "cell_type": "code", "execution_count": 31, "id": "e2d1d383", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
movie_idtitletags
019995AvatarIn the 22nd century, a paraplegic Marine is di...
1285Pirates of the Caribbean: At World's EndCaptain Barbossa, long believed to be dead, ha...
2206647SpectreA cryptic message from Bond’s past sends him o...
349026The Dark Knight RisesFollowing the death of District Attorney Harve...
449529John CarterJohn Carter is a war-weary, former military ca...
\n", "
" ], "text/plain": [ " movie_id title \\\n", "0 19995 Avatar \n", "1 285 Pirates of the Caribbean: At World's End \n", "2 206647 Spectre \n", "3 49026 The Dark Knight Rises \n", "4 49529 John Carter \n", "\n", " tags \n", "0 In the 22nd century, a paraplegic Marine is di... \n", "1 Captain Barbossa, long believed to be dead, ha... \n", "2 A cryptic message from Bond’s past sends him o... \n", "3 Following the death of District Attorney Harve... \n", "4 John Carter is a war-weary, former military ca... " ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Checking the new data\n", "new_df.head()" ] }, { "cell_type": "code", "execution_count": 32, "id": "a92ee349", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'In the 22nd century, a paraplegic Marine is dispatched to the moon Pandora on a unique mission, but becomes torn between following orders and protecting an alien civilization. Action Adventure Fantasy ScienceFiction cultureclash future spacewar spacecolony society spacetravel futuristic romance space alien tribe alienplanet cgi marine soldier battle loveaffair antiwar powerrelations mindandsoul 3d SamWorthington ZoeSaldana SigourneyWeaver JamesCameron'" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Checking one of the tags to check how the data looks\n", "new_df['tags'][0]" ] }, { "cell_type": "code", "execution_count": 33, "id": "383b01eb", "metadata": {}, "outputs": [], "source": [ "#Converting the tags data into lowercase\n", "new_df['tags'] = new_df['tags'].apply(lambda x:x.lower())" ] }, { "cell_type": "code", "execution_count": 34, "id": "5f145429", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'in the 22nd century, a paraplegic marine is dispatched to the moon pandora on a unique mission, but becomes torn between following orders and protecting an alien civilization. 
action adventure fantasy sciencefiction cultureclash future spacewar spacecolony society spacetravel futuristic romance space alien tribe alienplanet cgi marine soldier battle loveaffair antiwar powerrelations mindandsoul 3d samworthington zoesaldana sigourneyweaver jamescameron'" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Checking again after applying the lower case function\n", "new_df['tags'][0]" ] }, { "cell_type": "markdown", "id": "e28bcd88", "metadata": {}, "source": [ "## Text Vectorization" ] }, { "cell_type": "code", "execution_count": 35, "id": "e7bdc9f6", "metadata": {}, "outputs": [], "source": [ "#Importing CountVectorizer, which converts a collection of text documents to a matrix of token counts\n", "from sklearn.feature_extraction.text import CountVectorizer\n", " " ] }, { "cell_type": "code", "execution_count": 36, "id": "421b740a", "metadata": {}, "outputs": [], "source": [ "#Creating a vectorizer cv that keeps the 5000 most frequent words and drops English stop words\n", "cv = CountVectorizer(max_features=5000,stop_words='english')" ] }, { "cell_type": "code", "execution_count": 37, "id": "7f40be01", "metadata": {}, "outputs": [], "source": [ "# Transforming the tags into count vectors and storing them as a dense array\n", "vectors = cv.fit_transform(new_df['tags']).toarray()" ] }, { "cell_type": "code", "execution_count": 38, "id": "29c36bf0", "metadata": {}, "outputs": [], "source": [ "## The 5000 most frequent words (get_feature_names() was removed in scikit-learn 1.2)\n", "# cv.get_feature_names_out()" ] }, { "cell_type": "markdown", "id": "4a3e66f6", "metadata": {}, "source": [ "## Applying Stemming Process" ] }, { "cell_type": "markdown", "id": "8cbff737", "metadata": {}, "source": [ "Stemming is a natural language processing technique that reduces inflected words to their root form, which helps normalize text during preprocessing. Simply put, it chops words down to their stems: for example, *eating* becomes *eat*. 
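\n", "\n", "As a rough illustration of the idea only, the sketch below strips a few common suffixes in plain Python. It is a toy and not the Porter algorithm: the names toy_stem and toy_stem_text and the suffix list are assumptions made for this example.\n", "\n", "
```python
# Toy suffix-stripping stemmer, for illustration only.
# This is NOT the Porter algorithm; the suffix list below is an assumption.
SUFFIXES = ('ing', 'ed', 'es', 's')


def toy_stem(word):
    '''Strip the first matching suffix, keeping at least three characters.'''
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def toy_stem_text(text):
    '''Stem every whitespace-separated token and rejoin them with spaces.'''
    return ' '.join(toy_stem(token) for token in text.split())


print(toy_stem_text('Eating follows orders protected'))
```
\n", "\n", "A real stemmer applies many more rules and conditions than this fixed suffix list.\n", "\n", "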
Several stemmers exist; this project uses the Porter stemmer, one of the most widely used.\n", "\n", "The Porter stemmer is best known from data mining and information retrieval, and it applies to English only. It is based on the idea that English suffixes are built from combinations of smaller, simpler suffixes, and it is valued for its simplicity and speed. Its main advantage is that it is commonly regarded as producing good output with a relatively low error rate compared with other stemmers." ] }, { "cell_type": "code", "execution_count": 39, "id": "26ff7212", "metadata": {}, "outputs": [], "source": [ "#Importing the NLTK library for the stemming process\n", "import nltk " ] }, { "cell_type": "code", "execution_count": 40, "id": "3f2e9abf", "metadata": {}, "outputs": [], "source": [ "#Importing PorterStemmer from NLTK and creating a stemmer instance\n", "from nltk.stem.porter import PorterStemmer\n", "ps = PorterStemmer()\n", " " ] }, { "cell_type": "code", "execution_count": 41, "id": "5c7cd073", "metadata": {}, "outputs": [], "source": [ "#Defining the stemming function: stem each whitespace-separated token and rejoin with spaces\n", "def stem(text):\n", "    y=[]\n", "    for i in text.split():\n", "        y.append(ps.stem(i))\n", "    return \" \".join(y)\n", " " ] }, { "cell_type": "code", "execution_count": 42, "id": "9fab33f0", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'in the 22nd century, a parapleg marin is dispatch to the moon pandora on a uniqu mission, but becom torn between follow order and protect an alien civilization. 
action adventur fantasi sciencefict cultureclash futur spacewar spacecoloni societi spacetravel futurist romanc space alien tribe alienplanet cgi marin soldier battl loveaffair antiwar powerrel mindandsoul 3d samworthington zoesaldana sigourneyweav jamescameron'" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Checking on the sample text\n", "stem('In the 22nd century, a paraplegic Marine is dispatched to the moon Pandora on a unique mission, but becomes torn between following orders and protecting an alien civilization. Action Adventure Fantasy ScienceFiction cultureclash future spacewar spacecolony society spacetravel futuristic romance space alien tribe alienplanet cgi marine soldier battle loveaffair antiwar powerrelations mindandsoul 3d SamWorthington ZoeSaldana SigourneyWeaver JamesCameron')\n", " " ] }, { "cell_type": "code", "execution_count": 43, "id": "7044bea2", "metadata": {}, "outputs": [], "source": [ "#Applying the stemming function to the tags column in our new data\n", "#Note: vectors was built before this step, so re-run the vectorization cell afterwards if the stemmed tags should affect the similarity scores\n", "new_df['tags'] = new_df['tags'].apply(stem)" ] }, { "cell_type": "markdown", "id": "0beb8098", "metadata": {}, "source": [ "## Similarity Measures" ] }, { "cell_type": "markdown", "id": "76adf86f", "metadata": {}, "source": [ "In this case study, we use cosine similarity from scikit-learn as the metric for the similarity between two movies.\n", "\n", "Cosine similarity is a metric used to measure how similar two items are. Mathematically, it measures the cosine of the angle between two vectors projected in a multi-dimensional space. 
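\n", "\n", "To make the definition concrete, here is a minimal sketch in plain Python that builds word-count vectors for two short texts and computes the cosine of the angle between them. The helper name cosine_sim and the sample strings are assumptions for this example; the notebook itself relies on cosine_similarity from sklearn.\n", "\n", "
```python
import math
from collections import Counter


def cosine_sim(text_a, text_b):
    '''Cosine similarity between the word-count vectors of two strings.'''
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    # Dot product: only words present in both texts contribute.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    # Euclidean length of each count vector.
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


# Identical tags score close to 1, disjoint tags score 0, partial overlap falls in between.
print(cosine_sim('action adventure space', 'action adventure space'))
print(cosine_sim('action adventure space', 'romance drama'))
print(cosine_sim('action adventure space', 'action crime drama'))
```
\n", "\n", "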
Because the count vectors used here are non-negative, the output value ranges from 0 to 1.\n", "\n", "0 means no similarity, whereas 1 means that the two items are 100% similar.\n", "\n" ] }, { "cell_type": "code", "execution_count": 44, "id": "4a7cb6e7", "metadata": {}, "outputs": [], "source": [ "#Importing cosine_similarity from sklearn\n", "from sklearn.metrics.pairwise import cosine_similarity" ] }, { "cell_type": "code", "execution_count": 45, "id": "9957c15b", "metadata": {}, "outputs": [], "source": [ "#Computing the pairwise cosine similarity between all movie vectors\n", "similarity = cosine_similarity(vectors)\n" ] }, { "cell_type": "markdown", "id": "6f626560", "metadata": {}, "source": [ "## Making the recommendation function" ] }, { "cell_type": "code", "execution_count": 46, "id": "9089b7cb", "metadata": {}, "outputs": [], "source": [ "#Creating the function for Movie Recommendation using cosine similarity\n", "def recommend(movie):\n", "    #Get the row index of the movie from its title (raises IndexError if the title is not in the dataset)\n", "    movie_index = new_df[new_df['title'] == movie].index[0]\n", "    #Similarity scores between this movie and every movie in the dataset\n", "    distances = similarity[movie_index]\n", "    #Sort the (index, score) pairs by score with reverse=True so the most similar come first,\n", "    #then skip position 0 (the movie itself) and keep the next 5\n", "    movies_list = sorted(list(enumerate(distances)),reverse=True, key=lambda x:x[1])[1:6]\n", "\n", "    for i in movies_list:\n", "        print(new_df.iloc[i[0]].title)" ] }, { "cell_type": "markdown", "id": "07001d9a", "metadata": {}, "source": [ "## Recommendation" ] }, { "cell_type": "code", "execution_count": 47, "id": "2229cfab", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The Dark Knight\n", "The Dark Knight Rises\n", "Batman\n", "Batman & Robin\n", "Batman\n" ] } ], "source": [ "#Only enter titles that exist in the dataset; any other title raises an error\n", "recommend('Batman Begins') " ] }, { "cell_type": "code", "execution_count": 48, "id": "692a4331", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "movie_id 440\n", "title Aliens vs Predator: Requiem\n", "tags a sequel to 2004' alien vs. 
predator, the icon...\n", "Name: 1216, dtype: object" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "new_df.iloc[1216]" ] }, { "cell_type": "markdown", "id": "7fbbcb1c", "metadata": {}, "source": [ "## Exporting the Model" ] }, { "cell_type": "code", "execution_count": 49, "id": "d8b99651", "metadata": {}, "outputs": [], "source": [ "import pickle" ] }, { "cell_type": "code", "execution_count": 50, "id": "2d34f863", "metadata": {}, "outputs": [], "source": [ "#Saving the dataframe; the with block closes the file once it is written\n", "with open('movies.pkl','wb') as f:\n", "    pickle.dump(new_df,f)" ] }, { "cell_type": "code", "execution_count": 51, "id": "11c31baa", "metadata": {}, "outputs": [], "source": [ "#Saving the same data as a plain dictionary\n", "with open('movie_dict.pkl','wb') as f:\n", "    pickle.dump(new_df.to_dict(),f)\n", " " ] }, { "cell_type": "code", "execution_count": 52, "id": "0a7654a3", "metadata": {}, "outputs": [], "source": [ "#Saving the similarity matrix\n", "with open('similarity.pkl','wb') as f:\n", "    pickle.dump(similarity,f)\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" } }, "nbformat": 4, "nbformat_minor": 5 }