Mireia Duaso Bellido

Python

TMDb Ratings

This project studies how a movie's average vote relates to three of its attributes: budget, length, and genre.

Movie length vs Audience rating

The correlation between length and average vote is 0.352, so there is a positive but fairly weak association between movie duration and rating. The plot suggests that long movies, those over 150 minutes, tend to have ratings above the overall average of 6.17.

Genre vs Audience rating

The boxplots below do not show a strong relation between genre and rating. The horizontal red line marks the mean vote average across the whole dataset, and the mean of every genre lies very close to that line.
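The comparison between each genre mean and the overall mean can be sketched numerically as follows. This is a minimal example under stated assumptions: each movie is represented as a dictionary with a `genres` list and a `vote_average` value, which may not match how the notebook stores its data, and the rows are invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def genre_means(movies):
    """Return the overall mean rating and the mean rating per genre."""
    ratings_by_genre = defaultdict(list)
    for movie in movies:
        # A movie with several genres counts once toward each of them
        for genre in movie["genres"]:
            ratings_by_genre[genre].append(movie["vote_average"])
    overall = mean(m["vote_average"] for m in movies)
    per_genre = {g: mean(vals) for g, vals in ratings_by_genre.items()}
    return overall, per_genre


# Made-up rows for illustration only; not taken from the TMDb dataset
toy_movies = [
    {"genres": ["Drama"], "vote_average": 6.5},
    {"genres": ["Action", "Comedy"], "vote_average": 6.0},
    {"genres": ["Comedy"], "vote_average": 5.9},
    {"genres": ["Action", "Drama"], "vote_average": 6.3},
]

overall, per_genre = genre_means(toy_movies)
for genre, value in sorted(per_genre.items()):
    # The offset is the distance between the genre mean and the overall mean
    print(f"{genre}: mean {value:.2f}, offset from overall {value - overall:+.2f}")
```

Small offsets for every genre correspond to genre means that sit close to the red line in the boxplots.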

Budget vs Audience rating

Given the plots below and the correlation coefficient between budget and ratings, 0.037, there is essentially no linear relation between the two variables.

References

About the dataset

This dataset is a cleaned version of an original Kaggle dataset. It contains information about 10,000 movies collected from The Movie Database (TMDb), including user ratings.

About the Jupyter notebook

The notebook is divided into four sections: Introduction, Data Wrangling, Exploratory Data Analysis, and Conclusions.
