Project: Netflix Data Analysis

A simulated Jupyter Notebook exploring the Netflix catalog.


1. Loading the Data


import pandas as pd
df = pd.read_csv('netflix_titles.csv')
df.head()
                    
show_id  type     title                  country        release_year
s1       Movie    Dick Johnson Is Dead   United States  2020
s2       TV Show  Blood & Water          South Africa   2021
s3       TV Show  Ganglands              France         2021
s4       TV Show  Jailbirds New Orleans  United States  2021
s5       TV Show  Kota Factory           India          2021

2. Content Growth Over Time


import matplotlib.pyplot as plt
import seaborn as sns

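plt.figure(figsize=(10, 6))
# Keep the 15 release years with the most titles, then sort them chronologically for the x-axis
top_years = df['release_year'].value_counts().iloc[:15].index.sort_values()
sns.countplot(x='release_year', hue='type', data=df, order=top_years)
plt.title('Titles by Release Year (15 Most Common Years)')
plt.show()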
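
The plot above groups titles by release_year, which records when a title first came out, not when it joined the catalog. A minimal sketch of counting titles by the year they were added follows. It assumes the CSV also has a date_added column holding strings such as 'September 25, 2021'; that column and its format are assumptions, since it does not appear in the table shown earlier. The sample frame is hypothetical stand-in data so the cell runs without the CSV.

```python
import pandas as pd

# Hypothetical sample rows standing in for netflix_titles.csv. The date_added
# column and its string format are assumptions, not taken from the table above.
sample = pd.DataFrame({
    'type': ['Movie', 'TV Show', 'TV Show', 'Movie'],
    'date_added': ['September 25, 2021', 'September 24, 2021', 'January 1, 2020', None],
})

# Parse the date strings; missing or unparseable values become NaT.
added = pd.to_datetime(sample['date_added'].str.strip(), errors='coerce')

# Count titles per (year added, type), dropping rows without a usable date.
added_per_year = (
    sample.assign(year_added=added.dt.year)
          .dropna(subset=['year_added'])
          .astype({'year_added': int})
          .groupby(['year_added', 'type'])
          .size()
          .unstack(fill_value=0)
)
print(added_per_year)
```

On the real data, the same steps could be applied to df in place of sample, and added_per_year.plot(kind='bar') would draw the yearly counts per type.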
                    

3. Top 10 Movie Genres


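movie_df = df[df['type'] == 'Movie']
# listed_in holds comma-separated genre lists, so split and explode to count each genre on its own
genres = movie_df['listed_in'].str.split(', ').explode()
plt.figure(figsize=(12, 8))
sns.countplot(y=genres, order=genres.value_counts().iloc[:10].index)
plt.title('Top 10 Movie Genres')
plt.show()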