Analyzing Movie Ratings via SVD
The full MovieLens 25M dataset contains
- 25 million ratings
- from 162,000 users
- on 62,000 movies
The dataset was generated on November 21st, 2019, so it is fairly current. In this activity, we'll use a reduced subset of this data that includes only popular movies (those rated at least 1,000 times) and prolific users (those who have rated at least 500 movies). This leaves us with
- 7.1 million ratings
- from 9,663 users
- on 3,790 movies
The goals of this activity are threefold.
- To work with a different type of data than images or temperatures (here we will be working with ratings). Applying the tools you have learned in this module to different domains will help solidify your learning, help you see connections, and potentially get you excited for your module 1 project.
- To see how the SVD can be used to examine the important trends in your data (you have already had lots of practice using the EVD on the overnight).
- To have some fun!
To get started, we'll load the data and display a small part of it. Please see the comments in the code for more information.
load('movielens25m.mat');
sizeOfMovies = size(movies)
% the cell array `movies` is 3790 by 3 (one row for each of the 3,790 movies
% in the reduced dataset). Each row corresponds to a particular movie, and
% along the second dimension the entries correspond to the movie ID,
% the movie title, and the movie genre
% Here we extract the information about the first movie in the dataset
[movieId, movieTitle, movieGenre] = movies{1,:}
movieId = int64
1
movieTitle = 'Toy Story (1995)'
movieGenre = 'Adventure|Animation|Children|Comedy|Fantasy'
ratingsSize = size(ratings)
% the matrix `ratings` is 9663 by 3790 and encodes the rating that a
% particular user (row) gave to a particular movie (column). The ratings
% run from 0.5 to 5 stars in half-star steps, or take the special value
% NaN (not a number) if the user didn't rate that particular movie.
% Let's look at the ratings that were given to the first movie in the
% dataset, which as we saw is Toy Story. We can do this using the histc
% function (histc does not count NaN entries, so missing ratings are ignored)
possibleRatings = 0.5:0.5:5;
% movies are the columns of `ratings`, so Toy Story's ratings are column 1
nRatings = histc(ratings(:,1), possibleRatings);
bar(possibleRatings, nRatings);
ylabel('Number of Users');
title(['Ratings for ', movieTitle])
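% Repeat the analysis for another movie. anacondaIndex is not defined
% elsewhere in this section; looking it up by title is one way to set it.
anacondaIndex = find(strcmp(movies(:,2), 'Anaconda (1997)'));
[movieId, movieTitle, movieGenre] = movies{anacondaIndex,:}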
movieId = int64
1499
movieTitle = 'Anaconda (1997)'
movieGenre = 'Action|Adventure|Thriller'
% movies are the columns of `ratings`, so index the column for this movie
nRatings = histc(ratings(:,anacondaIndex), possibleRatings);
bar(possibleRatings, nRatings);
ylabel('Number of Users');
title(['Ratings for ', movieTitle])
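Before cleaning the data, it helps to know how much of the ratings matrix is actually filled in. The short calculation below is only a back-of-the-envelope sketch that uses the rounded counts quoted at the top of this activity (7.1 million ratings, 9,663 users, 3,790 movies); the variable names are made up for illustration.

```matlab
% Rounded counts quoted earlier in this activity (not read from the .mat file)
countRatings = 7.1e6;   % ratings in the reduced dataset
countUsers   = 9663;    % rows of the ratings matrix
countMovies  = 3790;    % columns of the ratings matrix

% Total number of (user, movie) entries in the matrix
countEntries = countUsers * countMovies;

% Fraction of entries that hold a rating, and the fraction that are NaN
fractionRated   = countRatings / countEntries;
fractionMissing = 1 - fractionRated;

fprintf('Rated: %.1f%%, missing: %.1f%%\n', 100*fractionRated, 100*fractionMissing);
```

Roughly one entry in five holds a rating, so most of the matrix is NaN even after restricting to popular movies and active users. With the data loaded, `mean(isnan(ratings(:)))` would give the exact missing fraction.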
Cleaning up the Data
As you probably guessed, we're going to apply the SVD to this data. Before we start the analysis, we'll do a few things to make the problem easier to handle. First, we have to deal with the many missing values in our ratings matrix (i.e., movies that particular users did not rate). Filling in missing values is called data imputation. There are many ways to do this, but we've chosen a particularly easy strategy: replace each missing rating with the average rating of that movie (e.g., if a user didn't rate Toy Story, we fill in the average rating of Toy Story among the users in the dataset who did rate it).
% column means (one per movie), ignoring NaN entries; no toolbox needed
ratingsFilled = fillmissing(ratings, 'constant', mean(ratings, 1, 'omitnan'));
As a final data cleaning step, we're going to subtract out the mean of each row. This controls for the fact that users vary considerably in the numerical scores they assign to movies (e.g., one user's 3 may be more comparable to another user's 1).
ratingsMeanCentered = ratingsFilled - mean(ratingsFilled,2);
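To make the two cleaning steps concrete, here is a minimal sketch on a made-up 3-user by 3-movie matrix. It spells out the imputation with basic indexing rather than `fillmissing`, purely so each step is visible; all `toy...` names and values are illustrative assumptions, not part of the dataset.

```matlab
% Made-up ratings: rows are users, columns are movies, NaN means "not rated"
toyRatings = [  5  NaN   1;
                4    2 NaN;
              NaN    4   3];

% Step 1: per-movie (column) mean over the users who did rate the movie
toyMissing = isnan(toyRatings);
toyZeroed = toyRatings;
toyZeroed(toyMissing) = 0;                              % zero out NaNs so sum() works
toyMovieMeans = sum(toyZeroed, 1) ./ sum(~toyMissing, 1);

% Fill each missing entry with the mean of its column
toyMeanGrid = repmat(toyMovieMeans, size(toyRatings, 1), 1);
toyFilled = toyRatings;
toyFilled(toyMissing) = toyMeanGrid(toyMissing);

% Step 2: subtract each user's (row) mean so every row averages to zero
toyCentered = toyFilled - mean(toyFilled, 2);

disp(toyMovieMeans);
disp(toyFilled);
disp(toyCentered);
```

Note that the order matters: the movie means are computed from real ratings only, and the user means are computed after the gaps are filled, so the imputed values influence each user's mean.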
Framing the Problem Using SVD
Next, let's think about how the SVD might help us analyze this dataset. Suppose we compute the SVD of the matrix ratingsMeanCentered. Let's use $\mathbf{u}_1$ to refer to the first left singular vector, $\mathbf{v}_1$ to refer to the first right singular vector, and $\sigma_1$ to refer to the first singular value (let's assume that the first pair of singular vectors has the largest singular value).
Exercise
Before running any other code in this notebook, answer the following questions regarding the first pair of singular vectors.
- What are the sizes of $\mathbf{u}_1$ and $\mathbf{v}_1$? What does each of the dimensions of $\mathbf{u}_1$ correspond to? How about each dimension of $\mathbf{v}_1$?
- In 15.3.8 we talked about compressing the original matrix down to m + n + 1 values. If we think of $\mathbf{u}_1$, $\mathbf{v}_1$, and $\sigma_1$ as the compressed version of the ratings data, how would we reconstruct the ratings data using them? (You essentially did this already; we're hoping you can recall this fact from earlier and apply it here.)
- We can think of $\mathbf{v}_1$ as encoding the dominant trend that explains the ratings of each movie. For this dataset, what might this correspond to?
- We can think of $\mathbf{u}_1$ as encoding the dominant trend that explains the ratings by each user. For this dataset, what might this correspond to? Keep in mind we have already subtracted out the mean of each user. It might be helpful to expand your formula from the second question to see how the entries of $\mathbf{u}_1$ and $\mathbf{v}_1$ interact with each other.
Now we're going to compute the SVD. We'll just compute the 10 pairs of left and right singular vectors with the largest singular values.
[U, Sigma, V] = svds(ratingsMeanCentered, 10);
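Once you have answered the exercise above, a small experiment can help check your reasoning. The sketch below uses a made-up 4-by-3 matrix (not the ratings data) and the dense `svd` function rather than `svds`; the `toy...` names are hypothetical. It shows the sizes of the first singular pair, the rank-1 matrix built from them, and the fact that the signs of a singular-vector pair are arbitrary.

```matlab
% Made-up matrix standing in for mean-centered ratings: 4 users by 3 movies
toyA = [ 1  2  0;
        -1  0  1;
         0 -2 -1;
         2  1 -3];
[toyM, toyN] = size(toyA);

% Economy-size SVD; singular values come back sorted from largest to smallest
[toyU, toyS, toyV] = svd(toyA, 'econ');
toyU1 = toyU(:, 1);        % one entry per row (user)
toyV1 = toyV(:, 1);        % one entry per column (movie)
toySigma1 = toyS(1, 1);    % largest singular value

% Storage for the first singular triple: m + n + 1 numbers
toyStored = numel(toyU1) + numel(toyV1) + 1;

% Rank-1 matrix built from the triple (an outer product scaled by sigma)
toyA1 = toySigma1 * toyU1 * toyV1';

% Entry (r, c) of that matrix is a product of one user entry and one movie entry
r = 2; c = 3;
toyEntry = toySigma1 * toyU1(r) * toyV1(c);

% Flipping the sign of both vectors gives exactly the same rank-1 matrix
toyA1Flipped = toySigma1 * (-toyU1) * (-toyV1)';

% Squared Frobenius error equals the sum of the squared discarded singular values
toySingularValues = diag(toyS);
toyErrSq = norm(toyA - toyA1, 'fro')^2;
```

The sign ambiguity matters for interpretation: which end of a singular vector counts as "positive" can differ between software versions or runs, so only the contrast between the two ends is meaningful.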
Examining the Right Singular Vectors
Now that we've computed our singular vectors, let's see if we can make sense of them. It turns out that the right singular vectors (the ones that have to do with movies) are generally more interpretable than the left singular vectors (the ones that have to do with users). We'll start out by looking at each right singular vector.
Exercise
Before running the code, think through the following question with your table-mates.
What might you do in order to make sense of what a particular right singular vector represents? Consider things like examining small or large values, looking for correlations, etc. There's not only one right answer, so throw out some ideas and try to think through what examining a particular aspect of the vector might tell you.
(we'll leave a little space to make it easier not to look at what we did)
Looking at Large and Small Values
One simple way to understand the right singular vectors is to look at the largest and smallest components of each vector. This tells us which movies are most strongly positively or most strongly negatively associated with that component. In the code below, we'll print out the title, genre, and component of the 10 movies that are most positively and the 10 that are most negatively associated with each right singular vector.
Exercise: Based on these outputs, can you tell a story about what each singular vector represents?
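The helper `getHighAndLowMovies` is not defined in this section. The sketch below shows one way such a lookup could work, written inline on made-up data so it runs on its own. It is an assumption based on the layout of the printed tables (largest components first, in ascending order, followed by the most negative ones), not the course's actual implementation; all `toy...` names are hypothetical.

```matlab
% Made-up stand-ins: a movies cell array (ID, title, genre) and one
% right singular vector with one entry per movie
toyMovies = {int64(1), 'Movie A', 'Drama';
             int64(2), 'Movie B', 'Comedy';
             int64(3), 'Movie C', 'Action';
             int64(4), 'Movie D', 'Drama|War';
             int64(5), 'Movie E', 'Comedy|Romance'};
toyV = [0.30; -0.50; 0.10; 0.45; -0.20];
toyK = 2;   % number of movies to show at each extreme (the activity uses 10)

% Sort the components from most negative to most positive
[~, toyOrder] = sort(toyV, 'ascend');

% The last k indices are the most positive, the first k the most negative
toyHighIdx = toyOrder(end-toyK+1:end);
toyLowIdx  = toyOrder(1:toyK);
toyRows = [toyHighIdx; toyLowIdx];

% Assemble title, genre, and component into a (2k)-by-3 cell array
toyResult = [toyMovies(toyRows, 2), toyMovies(toyRows, 3), num2cell(toyV(toyRows))];
disp(toyResult);
```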
% loop over every right singular vector that was computed
for i = 1:size(V, 2)
    disp(['Component ', num2str(i)]);
    getHighAndLowMovies(V(:,i), movies)
end
ans = 20×3 cell
| 1 | 2 | 3 |
1 | 'Usual Suspects, The (1995)' | 'Crime|Mystery|Thriller' | 0.0309 |
2 | '12 Angry Men (1957)' | 'Drama' | 0.0310 |
3 | 'Seven Samurai (Shichinin no samurai) (1954)' | 'Action|Adventure|Drama' | 0.0312 |
4 | 'Pulp Fiction (1994)' | 'Comedy|Crime|Drama|Thriller' | 0.0318 |
5 | 'Godfather: Part II, The (1974)' | 'Crime|Drama' | 0.0318 |
6 | 'Band of Brothers (2001)' | 'Action|Drama|War' | 0.0349 |
7 | 'Godfather, The (1972)' | 'Crime|Drama' | 0.0353 |
8 | 'Shawshank Redemption, The (1994)' | 'Crime|Drama' | 0.0353 |
9 | 'Planet Earth (2006)' | 'Documentary' | 0.0370 |
10 | 'Planet Earth II (2016)' | 'Documentary' | 0.0375 |
11 | 'Epic Movie (2007)' | 'Adventure|Comedy' | -0.0626 |
12 | 'Kazaam (1996)' | 'Children|Comedy|Fantasy' | -0.0599 |
13 | 'Battlefield Earth (2000)' | 'Action|Sci-Fi' | -0.0598 |
14 | 'Baby Geniuses (1999)' | 'Comedy' | -0.0597 |
15 | 'Dumb and Dumberer: When Harry Met Lloyd (2003)' | 'Comedy' | -0.0554 |
16 | 'Mighty Morphin Power Rangers: The Movie (1995)' | 'Action|Children' | -0.0542 |
17 | 'Lawnmower Man 2: Beyond Cyberspace (1996)' | 'Action|Sci-Fi|Thriller' | -0.0538 |
18 | 'Police Academy 6: City Under Siege (1989)' | 'Comedy|Crime' | -0.0527 |
19 | 'Problem Child 2 (1991)' | 'Comedy' | -0.0526 |
20 | 'Home Alone 3 (1997)' | 'Children|Comedy' | -0.0523 |
ans = 20×3 cell
| 1 | 2 | 3 |
1 | 'Matrix Reloaded, The (2003)' | 'Action|Adventure|Sci-Fi|Thriller|IMAX' | 0.0741 |
2 | 'Jurassic Park (1993)' | 'Action|Adventure|Sci-Fi|Thriller' | 0.0749 |
3 | 'Star Wars: Episode II - Attack of the Clones (2002)' | 'Action|Adventure|Sci-Fi|IMAX' | 0.0765 |
4 | 'Titanic (1997)' | 'Drama|Romance' | 0.0766 |
5 | 'Shrek (2001)' | 'Adventure|Animation|Children|Comedy|Fantasy|Romance' | 0.0770 |
6 | 'Armageddon (1998)' | 'Action|Romance|Sci-Fi|Thriller' | 0.0775 |
7 | 'Men in Black (a.k.a. MIB) (1997)' | 'Action|Comedy|Sci-Fi' | 0.0783 |
8 | 'Forrest Gump (1994)' | 'Comedy|Drama|Romance|War' | 0.0789 |
9 | 'Star Wars: Episode I - The Phantom Menace (1999)' | 'Action|Adventure|Sci-Fi' | 0.0858 |
10 | 'Independence Day (a.k.a. ID4) (1996)' | 'Action|Adventure|Sci-Fi|Thriller' | 0.0959 |
11 | 'Solaris (Solyaris) (1972)' | 'Drama|Mystery|Sci-Fi' | -0.0167 |
12 | 'Stuart Saves His Family (1995)' | 'Comedy' | -0.0167 |
13 | 'Halloween III: Season of the Witch (1982)' | 'Horror' | -0.0166 |
14 | 'Pink Flamingos (1972)' | 'Comedy' | -0.0165 |
15 | 'Dead Ringers (1988)' | 'Drama|Horror|Thriller' | -0.0164 |
16 | 'Even Cowgirls Get the Blues (1993)' | 'Comedy|Romance' | -0.0163 |
17 | 'Mr. Wrong (1996)' | 'Comedy' | -0.0163 |
18 | 'Alphaville (Alphaville, une étrange aventure de Lemmy Caution) (1965)' | 'Drama|Mystery|Romance|Sci-Fi|Thriller' | -0.0163 |
19 | 'Grand Illusion (La grande illusion) (1937)' | 'Drama|War' | -0.0162 |
20 | 'Girl 6 (1996)' | 'Comedy|Drama' | -0.0161 |
ans = 20×3 cell
| 1 | 2 | 3 |
1 | 'Thor (2011)' | 'Action|Adventure|Drama|Fantasy|IMAX' | 0.0568 |
2 | 'Iron Man (2008)' | 'Action|Adventure|Sci-Fi' | 0.0572 |
3 | 'Pirates of the Caribbean: Dead Man's Chest (2006)' | 'Action|Adventure|Fantasy' | 0.0583 |
4 | 'X-Men: The Last Stand (2006)' | 'Action|Sci-Fi|Thriller' | 0.0616 |
5 | 'Pirates of the Caribbean: At World's End (2007)' | 'Action|Adventure|Comedy|Fantasy' | 0.0619 |
6 | 'Avengers, The (2012)' | 'Action|Adventure|Sci-Fi|IMAX' | 0.0636 |
7 | 'Avatar (2009)' | 'Action|Adventure|Sci-Fi|IMAX' | 0.0639 |
8 | 'Iron Man 2 (2010)' | 'Action|Adventure|Sci-Fi|Thriller|IMAX' | 0.0640 |
9 | 'X-Men Origins: Wolverine (2009)' | 'Action|Sci-Fi|Thriller' | 0.0656 |
10 | 'Transformers (2007)' | 'Action|Sci-Fi|Thriller|IMAX' | 0.0741 |
11 | 'E.T. the Extra-Terrestrial (1982)' | 'Children|Drama|Sci-Fi' | -0.0559 |
12 | 'Who Framed Roger Rabbit? (1988)' | 'Adventure|Animation|Children|Comedy|Crime|Fantasy|Mystery' | -0.0541 |
13 | 'Big (1988)' | 'Comedy|Drama|Fantasy|Romance' | -0.0497 |
14 | 'Jaws (1975)' | 'Action|Horror' | -0.0489 |
15 | 'Ghostbusters (a.k.a. Ghost Busters) (1984)' | 'Action|Comedy|Sci-Fi' | -0.0488 |
16 | 'Honey, I Shrunk the Kids (1989)' | 'Adventure|Children|Comedy|Fantasy|Sci-Fi' | -0.0465 |
17 | 'Ghost (1990)' | 'Comedy|Drama|Fantasy|Romance|Thriller' | -0.0455 |
18 | 'Dances with Wolves (1990)' | 'Adventure|Drama|Western' | -0.0452 |
19 | 'Gremlins (1984)' | 'Comedy|Horror' | -0.0447 |
20 | 'Beetlejuice (1988)' | 'Comedy|Fantasy' | -0.0444 |
ans = 20×3 cell
| 1 | 2 | 3 |
1 | 'Air Force One (1997)' | 'Action|Thriller' | 0.0539 |
2 | 'Sleepless in Seattle (1993)' | 'Comedy|Drama|Romance' | 0.0542 |
3 | 'Pearl Harbor (2001)' | 'Action|Drama|Romance|War' | 0.0543 |
4 | 'You've Got Mail (1998)' | 'Comedy|Romance' | 0.0565 |
5 | 'Twister (1996)' | 'Action|Adventure|Romance|Thriller' | 0.0594 |
6 | 'Top Gun (1986)' | 'Action|Romance' | 0.0599 |
7 | 'Ghost (1990)' | 'Comedy|Drama|Fantasy|Romance|Thriller' | 0.0600 |
8 | 'Independence Day (a.k.a. ID4) (1996)' | 'Action|Adventure|Sci-Fi|Thriller' | 0.0714 |
9 | 'Pretty Woman (1990)' | 'Comedy|Romance' | 0.0731 |
10 | 'Armageddon (1998)' | 'Action|Romance|Sci-Fi|Thriller' | 0.0805 |
11 | 'Clockwork Orange, A (1971)' | 'Crime|Drama|Sci-Fi|Thriller' | -0.1208 |
12 | 'Pulp Fiction (1994)' | 'Comedy|Crime|Drama|Thriller' | -0.1203 |
13 | '2001: A Space Odyssey (1968)' | 'Adventure|Drama|Sci-Fi' | -0.1052 |
14 | 'Big Lebowski, The (1998)' | 'Comedy|Crime' | -0.0980 |
15 | 'Being John Malkovich (1999)' | 'Comedy|Drama|Fantasy' | -0.0946 |
16 | 'Shining, The (1980)' | 'Horror' | -0.0927 |
17 | 'Royal Tenenbaums, The (2001)' | 'Comedy|Drama' | -0.0899 |
18 | 'Fargo (1996)' | 'Comedy|Crime|Drama|Thriller' | -0.0896 |
19 | 'Kill Bill: Vol. 1 (2003)' | 'Action|Crime|Thriller' | -0.0886 |
20 | 'Taxi Driver (1976)' | 'Crime|Drama|Thriller' | -0.0853 |
ans = 20×3 cell
| 1 | 2 | 3 |
1 | 'Matrix, The (1999)' | 'Action|Sci-Fi|Thriller' | 0.0634 |
2 | 'Fight Club (1999)' | 'Action|Crime|Drama|Thriller' | 0.0639 |
3 | 'Rock, The (1996)' | 'Action|Adventure|Thriller' | 0.0645 |
4 | 'Con Air (1997)' | 'Action|Adventure|Thriller' | 0.0646 |
5 | 'The Devil's Advocate (1997)' | 'Drama|Mystery|Thriller' | 0.0651 |
6 | 'Predator (1987)' | 'Action|Sci-Fi|Thriller' | 0.0660 |
7 | 'Die Hard: With a Vengeance (1995)' | 'Action|Crime|Thriller' | 0.0698 |
8 | 'Fifth Element, The (1997)' | 'Action|Adventure|Comedy|Sci-Fi' | 0.0710 |
9 | 'From Dusk Till Dawn (1996)' | 'Action|Comedy|Horror|Thriller' | 0.0732 |
10 | 'Starship Troopers (1997)' | 'Action|Sci-Fi' | 0.0743 |
11 | 'Babe (1995)' | 'Children|Drama' | -0.1251 |
12 | 'Beauty and the Beast (1991)' | 'Animation|Children|Fantasy|Musical|Romance|IMAX' | -0.1108 |
13 | 'Wizard of Oz, The (1939)' | 'Adventure|Children|Fantasy|Musical' | -0.0969 |
14 | 'Toy Story 2 (1999)' | 'Adventure|Animation|Children|Comedy|Fantasy' | -0.0902 |
15 | 'Toy Story (1995)' | 'Adventure|Animation|Children|Comedy|Fantasy' | -0.0892 |
16 | 'Snow White and the Seven Dwarfs (1937)' | 'Animation|Children|Drama|Fantasy|Musical' | -0.0871 |
17 | 'Little Mermaid, The (1989)' | 'Animation|Children|Comedy|Musical|Romance' | -0.0849 |
18 | 'Chicken Run (2000)' | 'Animation|Children|Comedy' | -0.0827 |
19 | 'Finding Nemo (2003)' | 'Adventure|Animation|Children|Comedy' | -0.0814 |
20 | 'Mary Poppins (1964)' | 'Children|Comedy|Fantasy|Musical' | -0.0805 |
ans = 20×3 cell
| 1 | 2 | 3 |
1 | 'Predator (1987)' | 'Action|Sci-Fi|Thriller' | 0.0670 |
2 | 'Back to the Future (1985)' | 'Adventure|Comedy|Sci-Fi' | 0.0699 |
3 | 'Die Hard (1988)' | 'Action|Crime|Thriller' | 0.0713 |
4 | 'Terminator, The (1984)' | 'Action|Sci-Fi|Thriller' | 0.0728 |
5 | 'Terminator 2: Judgment Day (1991)' | 'Action|Sci-Fi' | 0.0764 |
6 | 'Star Wars: Episode IV - A New Hope (1977)' | 'Action|Adventure|Sci-Fi' | 0.0777 |
7 | 'Raiders of the Lost Ark (Indiana Jones and the Raiders of the Lost Ark) (1981)' | 'Action|Adventure' | 0.0851 |
8 | 'Star Wars: Episode V - The Empire Strikes Back (1980)' | 'Action|Adventure|Sci-Fi' | 0.0869 |
9 | 'RoboCop (1987)' | 'Action|Crime|Drama|Sci-Fi|Thriller' | 0.0870 |
10 | 'Ghostbusters (a.k.a. Ghost Busters) (1984)' | 'Action|Comedy|Sci-Fi' | 0.0904 |
11 | 'American Beauty (1999)' | 'Drama|Romance' | -0.1038 |
12 | 'Beautiful Mind, A (2001)' | 'Drama|Romance' | -0.0863 |
13 | 'Crash (2004)' | 'Crime|Drama' | -0.0782 |
14 | 'Good Will Hunting (1997)' | 'Drama|Romance' | -0.0740 |
15 | 'Vanilla Sky (2001)' | 'Mystery|Romance|Sci-Fi|Thriller' | -0.0713 |
16 | 'Forrest Gump (1994)' | 'Comedy|Drama|Romance|War' | -0.0709 |
17 | 'American History X (1998)' | 'Crime|Drama' | -0.0691 |
18 | 'As Good as It Gets (1997)' | 'Comedy|Drama|Romance' | -0.0678 |
19 | 'Erin Brockovich (2000)' | 'Drama' | -0.0658 |
20 | 'Dead Poets Society (1989)' | 'Drama' | -0.0658 |
ans = 20×3 cell
| 1 | 2 | 3 |
1 | 'Billy Madison (1995)' | 'Comedy' | 0.0849 |
2 | 'Austin Powers in Goldmember (2002)' | 'Comedy' | 0.0850 |
3 | 'Scary Movie (2000)' | 'Comedy|Horror' | 0.0876 |
4 | 'Zoolander (2001)' | 'Comedy' | 0.0877 |
5 | 'Austin Powers: The Spy Who Shagged Me (1999)' | 'Action|Adventure|Comedy' | 0.0932 |
6 | 'Ace Ventura: When Nature Calls (1995)' | 'Comedy' | 0.0937 |
7 | 'Dumb & Dumber (Dumb and Dumber) (1994)' | 'Adventure|Comedy' | 0.0948 |
8 | 'Austin Powers: International Man of Mystery (1997)' | 'Action|Adventure|Comedy' | 0.0962 |
9 | 'Happy Gilmore (1996)' | 'Comedy' | 0.1004 |
10 | 'Ace Ventura: Pet Detective (1994)' | 'Comedy' | 0.1009 |
11 | 'Titanic (1997)' | 'Drama|Romance' | -0.0953 |
12 | 'Saving Private Ryan (1998)' | 'Action|Drama|War' | -0.0876 |
13 | 'Star Wars: Episode I - The Phantom Menace (1999)' | 'Action|Adventure|Sci-Fi' | -0.0788 |
14 | 'Dances with Wolves (1990)' | 'Adventure|Drama|Western' | -0.0777 |
15 | 'Braveheart (1995)' | 'Action|Drama|War' | -0.0751 |
16 | 'E.T. the Extra-Terrestrial (1982)' | 'Children|Drama|Sci-Fi' | -0.0749 |
17 | 'Schindler's List (1993)' | 'Drama|War' | -0.0745 |
18 | 'Star Wars: Episode IV - A New Hope (1977)' | 'Action|Adventure|Sci-Fi' | -0.0730 |
19 | 'Jaws (1975)' | 'Action|Horror' | -0.0707 |
20 | 'Jurassic Park (1993)' | 'Action|Adventure|Sci-Fi|Thriller' | -0.0695 |
ans = 20×3 cell
| 1 | 2 | 3 |
1 | 'Mrs. Doubtfire (1993)' | 'Comedy|Drama' | 0.0615 |
2 | 'Braveheart (1995)' | 'Action|Drama|War' | 0.0620 |
3 | 'Toy Story (1995)' | 'Adventure|Animation|Children|Comedy|Fantasy' | 0.0649 |
4 | 'Shawshank Redemption, The (1994)' | 'Crime|Drama' | 0.0703 |
5 | 'Titanic (1997)' | 'Drama|Romance' | 0.0760 |
6 | 'Home Alone (1990)' | 'Children|Comedy' | 0.0766 |
7 | 'Lion King, The (1994)' | 'Adventure|Animation|Children|Drama|Musical|IMAX' | 0.0769 |
8 | 'Jurassic Park (1993)' | 'Action|Adventure|Sci-Fi|Thriller' | 0.0861 |
9 | 'Back to the Future (1985)' | 'Adventure|Comedy|Sci-Fi' | 0.0873 |
10 | 'Forrest Gump (1994)' | 'Comedy|Drama|Romance|War' | 0.1346 |
11 | 'Lara Croft: Tomb Raider (2001)' | 'Action|Adventure' | -0.0782 |
12 | 'Matrix Revolutions, The (2003)' | 'Action|Adventure|Sci-Fi|Thriller|IMAX' | -0.0741 |
13 | 'Daredevil (2003)' | 'Action|Crime' | -0.0719 |
14 | 'Charlie's Angels (2000)' | 'Action|Comedy' | -0.0700 |
15 | 'Van Helsing (2004)' | 'Action|Adventure|Fantasy|Horror' | -0.0672 |
16 | 'League of Extraordinary Gentlemen, The (a.k.a. LXG) (2003)' | 'Action|Fantasy|Sci-Fi' | -0.0670 |
17 | 'Star Wars: Episode II - Attack of the Clones (2002)' | 'Action|Adventure|Sci-Fi|IMAX' | -0.0661 |
18 | 'Fantastic Four (2005)' | 'Action|Adventure|Sci-Fi' | -0.0654 |
19 | 'Matrix Reloaded, The (2003)' | 'Action|Adventure|Sci-Fi|Thriller|IMAX' | -0.0634 |
20 | 'xXx (2002)' | 'Action|Crime|Thriller' | -0.0629 |
ans = 20×3 cell
| 1 | 2 | 3 |
1 | 'Harry Potter and the Prisoner of Azkaban (2004)' | 'Adventure|Fantasy|IMAX' | 0.0696 |
2 | 'Spirited Away (Sen to Chihiro no kamikakushi) (2001)' | 'Adventure|Animation|Fantasy' | 0.0696 |
3 | 'Harry Potter and the Chamber of Secrets (2002)' | 'Adventure|Fantasy' | 0.0773 |
4 | 'Rocky Horror Picture Show, The (1975)' | 'Comedy|Horror|Musical|Sci-Fi' | 0.0799 |
5 | 'Nightmare Before Christmas, The (1993)' | 'Animation|Children|Fantasy|Musical' | 0.0814 |
6 | 'Harry Potter and the Sorcerer's Stone (a.k.a. Harry Potter and the Philosopher's Stone) (2001)' | 'Adventure|Children|Fantasy' | 0.0850 |
7 | 'Lord of the Rings: The Return of the King, The (2003)' | 'Action|Adventure|Drama|Fantasy' | 0.0897 |
8 | 'Lord of the Rings: The Two Towers, The (2002)' | 'Adventure|Fantasy' | 0.0947 |
9 | 'Lord of the Rings: The Fellowship of the Ring, The (2001)' | 'Adventure|Fantasy' | 0.1043 |
10 | 'Fifth Element, The (1997)' | 'Action|Adventure|Comedy|Sci-Fi' | 0.1046 |
11 | 'Dumb & Dumber (Dumb and Dumber) (1994)' | 'Adventure|Comedy' | -0.1263 |
12 | 'There's Something About Mary (1998)' | 'Comedy|Romance' | -0.1177 |
13 | 'American Pie (1999)' | 'Comedy|Romance' | -0.1175 |
14 | 'Meet the Parents (2000)' | 'Comedy' | -0.0956 |
15 | 'Austin Powers: International Man of Mystery (1997)' | 'Action|Adventure|Comedy' | -0.0907 |
16 | 'Jaws (1975)' | 'Action|Horror' | -0.0817 |
17 | 'Happy Gilmore (1996)' | 'Comedy' | -0.0767 |
18 | 'Austin Powers: The Spy Who Shagged Me (1999)' | 'Action|Adventure|Comedy' | -0.0750 |
19 | 'Rocky (1976)' | 'Drama' | -0.0738 |
20 | 'Ace Ventura: Pet Detective (1994)' | 'Comedy' | -0.0735 |
ans = 20×3 cell
| 1 | 2 | 3 |
1 | 'Princess Bride, The (1987)' | 'Action|Adventure|Comedy|Fantasy|Romance' | 0.0837 |
2 | 'Star Wars: Episode III - Revenge of the Sith (2005)' | 'Action|Adventure|Sci-Fi' | 0.0949 |
3 | 'Star Wars: Episode I - The Phantom Menace (1999)' | 'Action|Adventure|Sci-Fi' | 0.0965 |
4 | 'Star Wars: Episode II - Attack of the Clones (2002)' | 'Action|Adventure|Sci-Fi|IMAX' | 0.1111 |
5 | 'Lord of the Rings: The Return of the King, The (2003)' | 'Action|Adventure|Drama|Fantasy' | 0.1140 |
6 | 'Lord of the Rings: The Two Towers, The (2002)' | 'Adventure|Fantasy' | 0.1245 |
7 | 'Star Wars: Episode V - The Empire Strikes Back (1980)' | 'Action|Adventure|Sci-Fi' | 0.1296 |
8 | 'Lord of the Rings: The Fellowship of the Ring, The (2001)' | 'Adventure|Fantasy' | 0.1328 |
9 | 'Star Wars: Episode VI - Return of the Jedi (1983)' | 'Action|Adventure|Sci-Fi' | 0.1422 |
10 | 'Star Wars: Episode IV - A New Hope (1977)' | 'Action|Adventure|Sci-Fi' | 0.1425 |
11 | 'Titanic (1997)' | 'Drama|Romance' | -0.1113 |
12 | 'Eyes Wide Shut (1999)' | 'Drama|Mystery|Thriller' | -0.0798 |
13 | 'Home Alone (1990)' | 'Children|Comedy' | -0.0791 |
14 | 'Home Alone 2: Lost in New York (1992)' | 'Children|Comedy' | -0.0784 |
15 | 'Speed (1994)' | 'Action|Romance|Thriller' | -0.0714 |
16 | 'Jumanji (1995)' | 'Adventure|Children|Fantasy' | -0.0708 |
17 | 'Fly, The (1986)' | 'Drama|Horror|Sci-Fi|Thriller' | -0.0675 |
18 | 'Face/Off (1997)' | 'Action|Crime|Drama|Thriller' | -0.0653 |
19 | 'Final Destination (2000)' | 'Drama|Thriller' | -0.0644 |
20 | 'American Psycho (2000)' | 'Crime|Horror|Mystery|Thriller' | -0.0639 |
Examining the Left Singular Vectors
Now we're going to check out the left singular vectors.
Exercise
Before running the code, think through the following question with your table-mates.
What might you do in order to make sense of what a particular left singular vector represents? Consider things like examining small or large values, looking for correlations, etc. There's not only one right answer, so throw out some ideas and try to think through what examining a particular aspect of the vector might tell you.
(we'll leave a little space to make it easier not to look at what we did)
Looking at Large and Small Values
As we did for the right singular vectors, let's look at the large (positive) and small (negative) components of each left singular vector. Instead of looking at the top 10 and bottom 10, we'll look at the single highest and single lowest component (each of which corresponds to a user). For each of those users, we'll show a sampling of the movies the user rated (focusing on that user's top 10 and bottom 10 ratings).
Exercise: Given what you know about the corresponding right singular vector, try to make sense of the users that are at either extreme of the left singular vectors.
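The helper `getHighAndLowUserRatings` is also not defined in this section. As a rough illustration of what such a lookup might do, the sketch below picks one user's highest and lowest ratings from made-up data while skipping unrated (NaN) movies. It is a hypothetical stand-in with invented `toy...` names, not the course's implementation, and the real helper may order ties differently.

```matlab
% Made-up data: six movie titles and one user's ratings (NaN = not rated)
toyTitles = {'Movie A'; 'Movie B'; 'Movie C'; 'Movie D'; 'Movie E'; 'Movie F'};
toyUserRatings = [4.5; NaN; 0.5; 5; NaN; 2];
toyCount = 2;   % ratings to show at each extreme (the activity uses 10)

% Keep only the movies this user actually rated
toyRatedIdx = find(~isnan(toyUserRatings));

% Sort those ratings from highest to lowest and map back to movie indices
[~, toyRank] = sort(toyUserRatings(toyRatedIdx), 'descend');
toySortedIdx = toyRatedIdx(toyRank);

% Highest-rated movies first, then the lowest-rated ones (lowest first)
toyTopIdx    = toySortedIdx(1:toyCount);
toyBottomIdx = toySortedIdx(end:-1:end-toyCount+1);
toyPicked = [toyTopIdx; toyBottomIdx];

% Pair each selected title with the rating the user gave it
toyTable = [toyTitles(toyPicked), num2cell(toyUserRatings(toyPicked))];
disp(toyTable);
```

Note that the lookup uses the original `ratings` matrix rather than the filled or mean-centered one, so only movies the user really rated can appear.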
% loop over every left singular vector that was computed
for i = 1:size(U, 2)
    % users with the most positive and most negative entries in this vector
    [~, highestUserIndex] = max(U(:,i));
    [~, lowestUserIndex] = min(U(:,i));
    disp(['Component ', num2str(i)]);
    disp('The user with the largest component rated the following movies as high and low');
    getHighAndLowUserRatings(highestUserIndex, movies, ratings)
    disp('The user with the smallest (probably negative) component rated the following movies as high and low');
    getHighAndLowUserRatings(lowestUserIndex, movies, ratings)
end
The user with the largest component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
1 | 'Thor: Ragnarok (2017)' | 5 |
2 | 'Louis C.K.: Live at The Comedy Store (2015)' | 5 |
3 | 'Tomorrowland (2015)' | 5 |
4 | 'Inside Out (2015)' | 5 |
5 | 'Creed (2015)' | 5 |
6 | 'Hunt for the Wilderpeople (2016)' | 5 |
7 | 'Planet Earth (2006)' | 5 |
8 | 'The Lego Batman Movie (2017)' | 5 |
9 | 'Baby Driver (2017)' | 5 |
10 | 'The Shape of Water (2017)' | 5 |
11 | 'Dracula: Dead and Loving It (1995)' | 0.5000 |
12 | 'Cutthroat Island (1995)' | 0.5000 |
13 | 'Four Rooms (1995)' | 0.5000 |
14 | 'Mortal Kombat (1995)' | 0.5000 |
15 | 'Don't Be a Menace to South Central While Drinking Your Juice in the Hood (1996)' | 0.5000 |
16 | 'Two if by Sea (1996)' | 0.5000 |
17 | 'Bio-Dome (1996)' | 0.5000 |
18 | 'Lawnmower Man 2: Beyond Cyberspace (1996)' | 0.5000 |
19 | 'Fair Game (1995)' | 0.5000 |
20 | 'Mary Reilly (1996)' | 0.5000 |
The user with the smallest (probably negative) component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
1 | 'Law Abiding Citizen (2009)' | 5 |
2 | 'Avatar (2009)' | 5 |
3 | 'Sherlock Holmes (2009)' | 5 |
4 | 'Shutter Island (2010)' | 5 |
5 | 'Inception (2010)' | 5 |
6 | 'Expendables, The (2010)' | 5 |
7 | '127 Hours (2010)' | 5 |
8 | 'Iron Man 3 (2013)' | 5 |
9 | 'Interstellar (2014)' | 5 |
10 | 'Furious 7 (2015)' | 5 |
11 | 'Father of the Bride Part II (1995)' | 0.5000 |
12 | 'Heat (1995)' | 0.5000 |
13 | 'Sudden Death (1995)' | 0.5000 |
14 | 'GoldenEye (1995)' | 0.5000 |
15 | 'Dracula: Dead and Loving It (1995)' | 0.5000 |
16 | 'Cutthroat Island (1995)' | 0.5000 |
17 | 'Get Shorty (1995)' | 0.5000 |
18 | 'Babe (1995)' | 0.5000 |
19 | 'Bed of Roses (1996)' | 0.5000 |
20 | 'Hate (Haine, La) (1995)' | 0.5000 |
The user with the largest component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
1 | 'The Intern (2015)' | 5 |
2 | 'Straight Outta Compton (2015)' | 5 |
3 | 'Everest (2015)' | 5 |
4 | 'Hotel Transylvania 2 (2015)' | 5 |
5 | 'Creed (2015)' | 5 |
6 | 'Finding Dory (2016)' | 5 |
7 | 'Captain Fantastic (2016)' | 5 |
8 | 'Hacksaw Ridge (2016)' | 5 |
9 | 'Arrival (2016)' | 5 |
10 | 'Rogue One: A Star Wars Story (2016)' | 5 |
11 | 'Doom (2005)' | 0.5000 |
12 | 'Diving Bell and the Butterfly, The (Scaphandre et le papillon, Le) (2007)' | 0.5000 |
13 | 'M*A*S*H (a.k.a. MASH) (1970)' | 1.5000 |
14 | 'Blair Witch Project, The (1999)' | 2 |
15 | 'Get Him to the Greek (2010)' | 2 |
16 | 'Puss in Boots (2011)' | 2 |
17 | 'Cat in the Hat, The (2003)' | 3 |
18 | 'Terminal, The (2004)' | 3 |
19 | 'Charlie and the Chocolate Factory (2005)' | 3 |
20 | 'Bewitched (2005)' | 3 |
The user with the smallest (probably negative) component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
1 | 'Matrix, The (1999)' | 5 |
2 | 'Eyes Wide Shut (1999)' | 5 |
3 | 'Lord of the Rings: The Fellowship of the Ring, The (2001)' | 5 |
4 | 'Kill Bill: Vol. 1 (2003)' | 5 |
5 | 'Kill Bill: Vol. 2 (2004)' | 5 |
6 | 'No Country for Old Men (2007)' | 5 |
7 | 'There Will Be Blood (2007)' | 5 |
8 | 'Grand Budapest Hotel, The (2014)' | 5 |
9 | 'Interstellar (2014)' | 5 |
10 | 'Dunkirk (2017)' | 5 |
11 | 'Grumpier Old Men (1995)' | 0.5000 |
12 | 'Babe (1995)' | 0.5000 |
13 | 'Mortal Kombat (1995)' | 0.5000 |
14 | 'Batman Forever (1995)' | 0.5000 |
15 | 'Casper (1995)' | 0.5000 |
16 | 'Congo (1995)' | 0.5000 |
17 | 'Judge Dredd (1995)' | 0.5000 |
18 | 'Demolition Man (1993)' | 0.5000 |
19 | 'Last Action Hero (1993)' | 0.5000 |
20 | 'Dead Man (1995)' | 0.5000 |
The user with the largest component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
1 | 'Star Wars: Episode V - The Empire Strikes Back (1980)' | 5 |
2 | 'Raiders of the Lost Ark (Indiana Jones and the Raiders of the Lost Ark) (1981)' | 5 |
3 | 'Lawrence of Arabia (1962)' | 5 |
4 | 'Star Wars: Episode VI - Return of the Jedi (1983)' | 5 |
5 | 'Groundhog Day (1993)' | 5 |
6 | 'Ben-Hur (1959)' | 5 |
7 | 'Hunt for Red October, The (1990)' | 5 |
8 | 'Saving Private Ryan (1998)' | 5 |
9 | 'Ronin (1998)' | 5 |
10 | 'Goldfinger (1964)' | 5 |
11 | 'Grumpier Old Men (1995)' | 1 |
12 | 'Waiting to Exhale (1995)' | 1 |
13 | 'Father of the Bride Part II (1995)' | 1 |
14 | 'Heat (1995)' | 1 |
15 | 'Sudden Death (1995)' | 1 |
16 | 'Dracula: Dead and Loving It (1995)' | 1 |
17 | 'Cutthroat Island (1995)' | 1 |
18 | 'Money Train (1995)' | 1 |
19 | 'Get Shorty (1995)' | 1 |
20 | 'Copycat (1995)' | 1 |
The user with the smallest (probably negative) component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
1 | '2046 (2004)' | 5 |
2 | 'Old Boy (2003)' | 5 |
3 | 'Apocalypto (2006)' | 5 |
4 | 'Pan's Labyrinth (Laberinto del fauno, El) (2006)' | 5 |
5 | 'There Will Be Blood (2007)' | 5 |
6 | 'Let the Right One In (Låt den rätte komma in) (2008)' | 5 |
7 | 'I Saw the Devil (Akmareul boatda) (2010)' | 5 |
8 | 'Mission: Impossible - Ghost Protocol (2011)' | 5 |
9 | 'Mission: Impossible - Rogue Nation (2015)' | 5 |
10 | 'Mad Max: Fury Road (2015)' | 5 |
11 | 'Congo (1995)' | 0.5000 |
12 | 'Coneheads (1993)' | 0.5000 |
13 | 'Demolition Man (1993)' | 0.5000 |
14 | 'RoboCop 3 (1993)' | 0.5000 |
15 | 'Barb Wire (1996)' | 0.5000 |
16 | 'Jack (1996)' | 0.5000 |
17 | 'Nutty Professor, The (1996)' | 0.5000 |
18 | 'Batman & Robin (1997)' | 0.5000 |
19 | 'Spawn (1997)' | 0.5000 |
20 | 'Flubber (1997)' | 0.5000 |
The user with the largest component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
1 | 'Mission: Impossible - Ghost Protocol (2011)' | 5 |
2 | 'Impossible, The (Imposible, Lo) (2012)' | 5 |
3 | 'Star Trek Into Darkness (2013)' | 5 |
4 | 'Man of Steel (2013)' | 5 |
5 | 'Godzilla (2014)' | 5 |
6 | 'Guardians of the Galaxy (2014)' | 5 |
7 | 'Star Wars: Episode VII - The Force Awakens (2015)' | 5 |
8 | 'The Age of Adaline (2015)' | 5 |
9 | 'The Man from U.N.C.L.E. (2015)' | 5 |
10 | 'Dunkirk (2017)' | 5 |
11 | 'Heat (1995)' | 0.5000 |
12 | 'Nixon (1995)' | 0.5000 |
13 | 'Casino (1995)' | 0.5000 |
14 | 'Get Shorty (1995)' | 0.5000 |
15 | 'Copycat (1995)' | 0.5000 |
16 | 'To Die For (1995)' | 0.5000 |
17 | 'Seven (a.k.a. Se7en) (1995)' | 0.5000 |
18 | 'Usual Suspects, The (1995)' | 0.5000 |
19 | 'Mighty Aphrodite (1995)' | 0.5000 |
20 | 'From Dusk Till Dawn (1996)' | 0.5000 |
The user with the smallest (probably negative) component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
1 | 'Primer (2004)' | 5 |
2 | 'Sideways (2004)' | 5 |
3 | 'Incredibles, The (2004)' | 5 |
4 | 'Battle of Algiers, The (La battaglia di Algeri) (1966)' | 5 |
5 | 'Aviator, The (2004)' | 5 |
6 | 'Sin City (2005)' | 5 |
7 | 'Batman Begins (2005)' | 5 |
8 | 'Constant Gardener, The (2005)' | 5 |
9 | 'Lord of War (2005)' | 5 |
10 | 'Weather Man, The (2005)' | 5 |
11 | 'Dracula: Dead and Loving It (1995)' | 0.5000 |
12 | 'Cutthroat Island (1995)' | 0.5000 |
13 | 'Mr. Holland's Opus (1995)' | 0.5000 |
14 | 'Bio-Dome (1996)' | 0.5000 |
15 | 'Screamers (1995)' | 0.5000 |
16 | 'Happy Gilmore (1996)' | 0.5000 |
17 | 'Muppet Treasure Island (1996)' | 0.5000 |
18 | 'Braveheart (1995)' | 0.5000 |
19 | 'Down Periscope (1996)' | 0.5000 |
20 | 'Bad Boys (1995)' | 0.5000 |
The user with the largest component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
---|
1 | 'Wolf of Wall Street, The (2013)' | 5 |
---|
2 | 'American Hustle (2013)' | 5 |
---|
3 | 'Interstellar (2014)' | 5 |
---|
4 | 'The Expendables 3 (2014)' | 5 |
---|
5 | 'John Wick (2014)' | 5 |
---|
6 | 'Nightcrawler (2014)' | 5 |
---|
7 | 'Mad Max: Fury Road (2015)' | 5 |
---|
8 | 'The Hateful Eight (2015)' | 5 |
---|
9 | 'Big Short, The (2015)' | 5 |
---|
10 | 'John Wick: Chapter Two (2017)' | 5 |
---|
11 | 'Toy Story (1995)' | 0.5000 |
---|
12 | 'Grumpier Old Men (1995)' | 0.5000 |
---|
13 | 'Father of the Bride Part II (1995)' | 0.5000 |
---|
14 | 'Sabrina (1995)' | 0.5000 |
---|
15 | 'Nixon (1995)' | 0.5000 |
---|
16 | 'Get Shorty (1995)' | 0.5000 |
---|
17 | 'Babe (1995)' | 0.5000 |
---|
18 | 'Pocahontas (1995)' | 0.5000 |
---|
19 | 'Mr. Holland's Opus (1995)' | 0.5000 |
---|
20 | 'Bio-Dome (1996)' | 0.5000 |
---|
The user with the smallest (probably negative) component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
---|
1 | 'Shrek 2 (2004)' | 5 |
---|
2 | 'Before Sunset (2004)' | 5 |
---|
3 | 'Finding Neverland (2004)' | 5 |
---|
4 | 'Charlie Brown Christmas, A (1965)' | 5 |
---|
5 | 'Million Dollar Baby (2004)' | 5 |
---|
6 | 'Hotel Rwanda (2004)' | 5 |
---|
7 | 'Notes on a Scandal (2006)' | 5 |
---|
8 | 'How the Grinch Stole Christmas! (1966)' | 5 |
---|
9 | 'Juno (2007)' | 5 |
---|
10 | 'Toy Story 3 (2010)' | 5 |
---|
11 | 'Natural Born Killers (1994)' | 0.5000 |
---|
12 | 'Stargate (1994)' | 0.5000 |
---|
13 | 'Ace Ventura: Pet Detective (1994)' | 0.5000 |
---|
14 | 'Mask, The (1994)' | 0.5000 |
---|
15 | 'Threesome (1994)' | 0.5000 |
---|
16 | 'Bad Taste (1987)' | 0.5000 |
---|
17 | 'Beneath the Planet of the Apes (1970)' | 0.5000 |
---|
18 | 'Hellbound: Hellraiser II (1988)' | 0.5000 |
---|
19 | 'Legend of Drunken Master, The (Jui kuen II) (1994)' | 0.5000 |
---|
20 | 'My Neighbor Totoro (Tonari no Totoro) (1988)' | 0.5000 |
---|
The user with the largest component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
---|
1 | 'Grand Budapest Hotel, The (2014)' | 5 |
---|
2 | 'Captain America: The Winter Soldier (2014)' | 5 |
---|
3 | 'Predestination (2014)' | 5 |
---|
4 | 'John Wick (2014)' | 5 |
---|
5 | 'Big Hero 6 (2014)' | 5 |
---|
6 | 'Kingsman: The Secret Service (2015)' | 5 |
---|
7 | 'Deadpool (2016)' | 5 |
---|
8 | 'Tomorrowland (2015)' | 5 |
---|
9 | 'Inside Out (2015)' | 5 |
---|
10 | 'Sicario (2015)' | 5 |
---|
11 | 'Jumanji (1995)' | 0.5000 |
---|
12 | 'Grumpier Old Men (1995)' | 0.5000 |
---|
13 | 'Waiting to Exhale (1995)' | 0.5000 |
---|
14 | 'Tom and Huck (1995)' | 0.5000 |
---|
15 | 'Sudden Death (1995)' | 0.5000 |
---|
16 | 'Nixon (1995)' | 0.5000 |
---|
17 | 'Cutthroat Island (1995)' | 0.5000 |
---|
18 | 'Ace Ventura: When Nature Calls (1995)' | 0.5000 |
---|
19 | 'Othello (1995)' | 0.5000 |
---|
20 | 'Now and Then (1995)' | 0.5000 |
---|
The user with the smallest (probably negative) component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
---|
1 | 'Seven (a.k.a. Se7en) (1995)' | 5 |
---|
2 | 'Silence of the Lambs, The (1991)' | 5 |
---|
3 | 'Pretty Woman (1990)' | 5 |
---|
4 | 'E.T. the Extra-Terrestrial (1982)' | 5 |
---|
5 | 'Blair Witch Project, The (1999)' | 5 |
---|
6 | 'Boys Don't Cry (1999)' | 5 |
---|
7 | 'Fight Club (1999)' | 5 |
---|
8 | 'Coyote Ugly (2000)' | 5 |
---|
9 | 'Memento (2000)' | 5 |
---|
10 | 'Donnie Darko (2001)' | 5 |
---|
11 | 'Judge Dredd (1995)' | 0.5000 |
---|
12 | 'Naked Gun 33 1/3: The Final Insult (1994)' | 0.5000 |
---|
13 | 'Hot Shots! Part Deux (1993)' | 0.5000 |
---|
14 | 'Kingpin (1996)' | 0.5000 |
---|
15 | 'Maltese Falcon, The (1941)' | 0.5000 |
---|
16 | 'Ninotchka (1939)' | 0.5000 |
---|
17 | 'Jean de Florette (1986)' | 0.5000 |
---|
18 | 'Willow (1988)' | 0.5000 |
---|
19 | 'Airplane! (1980)' | 0.5000 |
---|
20 | 'Airplane II: The Sequel (1982)' | 0.5000 |
---|
The user with the largest component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
---|
1 | 'Big Hero 6 (2014)' | 5 |
---|
2 | 'The Hobbit: The Battle of the Five Armies (2014)' | 5 |
---|
3 | 'Kingsman: The Secret Service (2015)' | 5 |
---|
4 | 'Mad Max: Fury Road (2015)' | 5 |
---|
5 | 'Star Wars: Episode VII - The Force Awakens (2015)' | 5 |
---|
6 | 'Avengers: Age of Ultron (2015)' | 5 |
---|
7 | 'Furious 7 (2015)' | 5 |
---|
8 | 'Kung Fury (2015)' | 5 |
---|
9 | 'Spectre (2015)' | 5 |
---|
10 | 'The Man from U.N.C.L.E. (2015)' | 5 |
---|
11 | 'There Will Be Blood (2007)' | 2.5000 |
---|
12 | 'Beautician and the Beast, The (1997)' | 3 |
---|
13 | 'Spice World (1997)' | 3 |
---|
14 | 'Cube (1997)' | 3 |
---|
15 | 'Blame It on Rio (1984)' | 3 |
---|
16 | 'Go (1999)' | 3 |
---|
17 | 'Adaptation (2002)' | 3 |
---|
18 | '25th Hour (2002)' | 3 |
---|
19 | 'Barton Fink (1991)' | 3 |
---|
20 | 'Pieces of April (2003)' | 3 |
---|
The user with the smallest (probably negative) component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
---|
1 | 'Grand Budapest Hotel, The (2014)' | 5 |
---|
2 | 'X-Men: Days of Future Past (2014)' | 5 |
---|
3 | 'Ex Machina (2015)' | 5 |
---|
4 | 'Avengers: Age of Ultron (2015)' | 5 |
---|
5 | 'Avengers: Infinity War - Part I (2018)' | 5 |
---|
6 | 'Avengers: Infinity War - Part II (2019)' | 5 |
---|
7 | 'X-Men: Apocalypse (2016)' | 5 |
---|
8 | 'The Hateful Eight (2015)' | 5 |
---|
9 | 'The Handmaiden (2016)' | 5 |
---|
10 | 'Mission: Impossible - Fallout (2018)' | 5 |
---|
11 | 'Toy Story (1995)' | 0.5000 |
---|
12 | 'Sense and Sensibility (1995)' | 0.5000 |
---|
13 | 'Ace Ventura: When Nature Calls (1995)' | 0.5000 |
---|
14 | 'Get Shorty (1995)' | 0.5000 |
---|
15 | 'Babe (1995)' | 0.5000 |
---|
16 | 'Clueless (1995)' | 0.5000 |
---|
17 | 'To Die For (1995)' | 0.5000 |
---|
18 | 'Pocahontas (1995)' | 0.5000 |
---|
19 | 'Mighty Aphrodite (1995)' | 0.5000 |
---|
20 | 'Batman Forever (1995)' | 0.5000 |
---|
The user with the largest component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
---|
1 | 'Wolf of Wall Street, The (2013)' | 4.5000 |
---|
2 | 'Whiplash (2014)' | 4.5000 |
---|
3 | 'The Revenant (2015)' | 4.5000 |
---|
4 | 'Rogue One: A Star Wars Story (2016)' | 4.5000 |
---|
5 | 'Casino (1995)' | 5 |
---|
6 | 'Star Wars: Episode IV - A New Hope (1977)' | 5 |
---|
7 | 'Carlito's Way (1993)' | 5 |
---|
8 | 'Star Wars: Episode V - The Empire Strikes Back (1980)' | 5 |
---|
9 | 'Goodfellas (1990)' | 5 |
---|
10 | 'Donnie Brasco (1997)' | 5 |
---|
11 | 'Toy Story (1995)' | 0.5000 |
---|
12 | 'Grumpier Old Men (1995)' | 0.5000 |
---|
13 | 'Father of the Bride Part II (1995)' | 0.5000 |
---|
14 | 'Sabrina (1995)' | 0.5000 |
---|
15 | 'Dracula: Dead and Loving It (1995)' | 0.5000 |
---|
16 | 'Sense and Sensibility (1995)' | 0.5000 |
---|
17 | 'Get Shorty (1995)' | 0.5000 |
---|
18 | 'Babe (1995)' | 0.5000 |
---|
19 | 'Mortal Kombat (1995)' | 0.5000 |
---|
20 | 'To Die For (1995)' | 0.5000 |
---|
The user with the smallest (probably negative) component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
---|
1 | 'Thor: Ragnarok (2017)' | 5 |
---|
2 | 'Guardians of the Galaxy 2 (2017)' | 5 |
---|
3 | 'Captain America: Civil War (2016)' | 5 |
---|
4 | 'Doctor Strange (2016)' | 5 |
---|
5 | 'X-Men: Apocalypse (2016)' | 5 |
---|
6 | 'Untitled Spider-Man Reboot (2017)' | 5 |
---|
7 | 'Batman v Superman: Dawn of Justice (2016)' | 5 |
---|
8 | 'The Man from U.N.C.L.E. (2015)' | 5 |
---|
9 | 'Wonder Woman (2017)' | 5 |
---|
10 | 'Incredibles 2 (2018)' | 5 |
---|
11 | 'Bullets Over Broadway (1994)' | 0.5000 |
---|
12 | 'Trainspotting (1996)' | 0.5000 |
---|
13 | 'Perfect Storm, The (2000)' | 0.5000 |
---|
14 | 'Requiem for a Dream (2000)' | 0.5000 |
---|
15 | 'Holiday, The (2006)' | 0.5000 |
---|
16 | 'Bridge to Terabithia (2007)' | 0.5000 |
---|
17 | 'Road, The (2009)' | 0.5000 |
---|
18 | '(500) Days of Summer (2009)' | 0.5000 |
---|
19 | 'Time Traveler's Wife, The (2009)' | 0.5000 |
---|
20 | 'Kids Are All Right, The (2010)' | 0.5000 |
---|
The user with the largest component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
---|
1 | 'Twilight Saga: New Moon, The (2009)' | 5 |
---|
2 | 'Princess and the Frog, The (2009)' | 5 |
---|
3 | 'Avatar (2009)' | 5 |
---|
4 | 'How to Train Your Dragon (2010)' | 5 |
---|
5 | 'Clash of the Titans (2010)' | 5 |
---|
6 | 'Inception (2010)' | 5 |
---|
7 | 'Tron: Legacy (2010)' | 5 |
---|
8 | 'Source Code (2011)' | 5 |
---|
9 | 'Puss in Boots (2011)' | 5 |
---|
10 | 'Prometheus (2012)' | 5 |
---|
11 | 'Usual Suspects, The (1995)' | 0.5000 |
---|
12 | 'Fair Game (1995)' | 0.5000 |
---|
13 | 'Young Poisoner's Handbook, The (1995)' | 0.5000 |
---|
14 | 'Jury Duty (1995)' | 0.5000 |
---|
15 | 'One Flew Over the Cuckoo's Nest (1975)' | 0.5000 |
---|
16 | 'Dead Poets Society (1989)' | 0.5000 |
---|
17 | 'Shall We Dance? (Shall We Dansu?) (1996)' | 0.5000 |
---|
18 | 'Out of the Past (1947)' | 0.5000 |
---|
19 | 'Police Academy 3: Back in Training (1986)' | 0.5000 |
---|
20 | 'Police Academy 4: Citizens on Patrol (1987)' | 0.5000 |
---|
The user with the smallest (probably negative) component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
---|
1 | 'City Slickers (1991)' | 5 |
---|
2 | '48 Hrs. (1982)' | 5 |
---|
3 | 'Smokey and the Bandit (1977)' | 5 |
---|
4 | 'Mystic River (2003)' | 5 |
---|
5 | 'Presumed Innocent (1990)' | 5 |
---|
6 | 'History of Violence, A (2005)' | 5 |
---|
7 | 'No Country for Old Men (2007)' | 5 |
---|
8 | 'Taken (2008)' | 5 |
---|
9 | 'True Grit (2010)' | 5 |
---|
10 | 'Hidden Figures (2016)' | 5 |
---|
11 | 'From Dusk Till Dawn (1996)' | 0.5000 |
---|
12 | 'Showgirls (1995)' | 0.5000 |
---|
13 | 'Tommy Boy (1995)' | 0.5000 |
---|
14 | 'Widows' Peak (1994)' | 0.5000 |
---|
15 | 'Last Emperor, The (1987)' | 0.5000 |
---|
16 | 'Lupin III: The Castle Of Cagliostro (Rupan sansei: Kariosutoro no shiro) (1979)' | 0.5000 |
---|
17 | 'Shoot 'Em Up (2007)' | 0.5000 |
---|
18 | 'Sorcerer's Apprentice, The (2010)' | 0.5000 |
---|
19 | 'Into the Woods (2014)' | 0.5000 |
---|
20 | 'Straight Outta Compton (2015)' | 0.5000 |
---|
The user with the largest component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
---|
1 | 'Lookout, The (2007)' | 5 |
---|
2 | 'Transformers (2007)' | 5 |
---|
3 | 'Stardust (2007)' | 5 |
---|
4 | 'Bourne Ultimatum, The (2007)' | 5 |
---|
5 | 'Superbad (2007)' | 5 |
---|
6 | 'Dan in Real Life (2007)' | 5 |
---|
7 | 'Enchanted (2007)' | 5 |
---|
8 | 'Juno (2007)' | 5 |
---|
9 | 'In Bruges (2008)' | 5 |
---|
10 | 'Forgetting Sarah Marshall (2008)' | 5 |
---|
11 | 'Ace Ventura: When Nature Calls (1995)' | 0.5000 |
---|
12 | 'Copycat (1995)' | 0.5000 |
---|
13 | 'Congo (1995)' | 0.5000 |
---|
14 | 'Beverly Hillbillies, The (1993)' | 0.5000 |
---|
15 | 'City Slickers II: The Legend of Curly's Gold (1994)' | 0.5000 |
---|
16 | 'Fatal Instinct (1993)' | 0.5000 |
---|
17 | 'RoboCop 3 (1993)' | 0.5000 |
---|
18 | 'Super Mario Bros. (1993)' | 0.5000 |
---|
19 | 'Space Jam (1996)' | 0.5000 |
---|
20 | 'Home Alone 2: Lost in New York (1992)' | 0.5000 |
---|
The user with the smallest (probably negative) component rated the following movies as high and low
ans = 20×2 cell
| 1 | 2 |
---|
1 | 'X-Men: The Last Stand (2006)' | 5 |
---|
2 | 'Pursuit of Happyness, The (2006)' | 5 |
---|
3 | 'Step Up (2006)' | 5 |
---|
4 | 'Illusionist, The (2006)' | 5 |
---|
5 | 'Idiocracy (2006)' | 5 |
---|
6 | 'Prestige, The (2006)' | 5 |
---|
7 | 'Blood Diamond (2006)' | 5 |
---|
8 | 'Shooter (2007)' | 5 |
---|
9 | 'Fracture (2007)' | 5 |
---|
10 | 'Live Free or Die Hard (2007)' | 5 |
---|
11 | 'City of Lost Children, The (Cité des enfants perdus, La) (1995)' | 0.5000 |
---|
12 | 'Shanghai Triad (Yao a yao yao dao waipo qiao) (1995)' | 0.5000 |
---|
13 | 'Postman, The (Postino, Il) (1994)' | 0.5000 |
---|
14 | 'French Twist (Gazon maudit) (1995)' | 0.5000 |
---|
15 | 'Misérables, Les (1995)' | 0.5000 |
---|
16 | 'Antonia's Line (Antonia) (1995)' | 0.5000 |
---|
17 | 'Hate (Haine, La) (1995)' | 0.5000 |
---|
18 | 'Rumble in the Bronx (Hont faan kui) (1995)' | 0.5000 |
---|
19 | 'Beauty of the Day (Belle de jour) (1967)' | 0.5000 |
---|
20 | 'Umbrellas of Cherbourg, The (Parapluies de Cherbourg, Les) (1964)' | 0.5000 |
---|
Next Steps
To give you a sense of where you might take this in a project, here are some things you might investigate next with this dataset.
- We didn't really look at how you would use the SVD to make recommendations. It turns out the SVD can be used to come up with good guesses for the missing values in the original ratings matrix (the NaNs), and you can then provide recommendations tailored to a particular user.
- We didn't quantify how well the SVD predicted the ratings. To do that, you could divide the ratings into a training set and a test set and see how well your SVD model predicts the test ratings (i.e., ratings that weren't used to compute the SVD).
- We filled in the missing values with the means of each movie, but there are variants of the SVD that can handle the missing values directly (they do entail tradeoffs). You could investigate how one of those methods would work on this data.
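As a starting point for the first two ideas, here is a minimal sketch of how a held-out evaluation might look. It runs on small synthetic data rather than the MovieLens matrix, and every name in it (`ratingsDemo`, `trainRatings`, `testIdx`, `predicted`, and so on) is hypothetical. The rank `k = 3`, the 60% missing rate, and the 20% test fraction are arbitrary choices for illustration, not values taken from this activity.

```matlab
% Sketch only: synthetic data stands in for the real ratings matrix.
rng(0);                                  % make the example repeatable
nUsers = 200; nMovies = 50; k = 3;       % illustrative sizes and rank

% Build synthetic low-rank ratings, clipped to the 0.5 to 5 star range
trueRatings = 3 + randn(nUsers, k) * randn(k, nMovies);
trueRatings = min(max(trueRatings, 0.5), 5);

% Hide about 60% of the entries to mimic unrated movies (NaN)
ratingsDemo = trueRatings;
ratingsDemo(rand(nUsers, nMovies) < 0.6) = NaN;

% Hold out about 20% of the observed ratings as a test set
observed = find(~isnan(ratingsDemo));
testIdx = observed(rand(size(observed)) < 0.2);
trainRatings = ratingsDemo;
trainRatings(testIdx) = NaN;             % the model never sees these

% Fill missing training entries with each movie's mean training rating
movieMeans = mean(trainRatings, 1, 'omitnan');
% Fall back to the overall mean if a movie has no training ratings
movieMeans(isnan(movieMeans)) = mean(trainRatings(:), 'omitnan');
meanMatrix = repmat(movieMeans, nUsers, 1);
filled = trainRatings;
missing = isnan(filled);
filled(missing) = meanMatrix(missing);

% Rank-k approximation of the filled matrix gives predicted ratings
[U, S, V] = svd(filled, 'econ');
predicted = U(:, 1:k) * S(1:k, 1:k) * V(:, 1:k)';

% Compare test error against simply predicting the movie mean
rmseSvd = sqrt(mean((predicted(testIdx) - ratingsDemo(testIdx)).^2));
rmseBaseline = sqrt(mean((meanMatrix(testIdx) - ratingsDemo(testIdx)).^2));
fprintf('Test RMSE, rank-%d SVD: %.3f\n', k, rmseSvd);
fprintf('Test RMSE, movie means: %.3f\n', rmseBaseline);

% Recommend the unrated movies with the highest predicted rating for one user
userIndex = 1;
unrated = find(isnan(ratingsDemo(userIndex, :)));
[~, order] = sort(predicted(userIndex, unrated), 'descend');
topMovies = unrated(order(1:min(5, numel(order))));
disp(topMovies);
```

Comparing against the movie-mean baseline shows whether the low-rank model adds anything beyond the fill-in values themselves. On the real data you would replace the synthetic matrix with `ratings` and try several values of `k`.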
function movieExtremes = getHighAndLowMovies(v, movies)
    % return a cell array with the most positive and most negative
    % components of the right singular vector v.
    % Columns: movie title, movie genre, component value.
    nHighLow = 10;               % entries per extreme (20 rows in total)
    [c, indices] = sort(v);      % ascending, so the most positive come last
    movieExtremes = cell(nHighLow*2, 3);
    % most positive components
    movieExtremes(1:nHighLow,1) = movies(indices(end-(nHighLow-1):end),2);
    movieExtremes(1:nHighLow,2) = movies(indices(end-(nHighLow-1):end),3);
    movieExtremes(1:nHighLow,3) = num2cell(c(end-(nHighLow-1):end));
    % most negative components
    movieExtremes(1+nHighLow:end,1) = movies(indices(1:nHighLow),2);
    movieExtremes(1+nHighLow:end,2) = movies(indices(1:nHighLow),3);
    movieExtremes(1+nHighLow:end,3) = num2cell(c(1:nHighLow));
end
function userRatings = getHighAndLowUserRatings(userIndex, movies, ratings)
    % return a cell array with the most positive and most negative reviews
    % given by the specified user
    % Columns: movie title, rating.
    nHighLow = 10;               % matches the 20-row tables shown above
    userRatings = cell(nHighLow*2,2);
    [r, indices] = sort(ratings(userIndex,:));
    % sort places NaN (unrated) last, so drop those from both outputs
    rated = ~isnan(r);
    indices = indices(rated);
    r = r(rated);
    % highest ratings
    userRatings(1:nHighLow,1) = movies(indices(end-(nHighLow-1):end),2);
    userRatings(1:nHighLow,2) = num2cell(r(end-(nHighLow-1):end));
    % lowest ratings
    userRatings(1+nHighLow:end,1) = movies(indices(1:nHighLow),2);
    userRatings(1+nHighLow:end,2) = num2cell(r(1:nHighLow));
end