Predicting popularity of Reddit posts using machine learning
Master thesis
Permanent link
http://hdl.handle.net/11250/2620520
Publication date
2019-06
Collections
- Student theses (TN-IDE) [823]
Abstract
Using data from the social network Reddit, we investigate whether it is possible to predict if a submission will gain popularity. We examine the components of a Reddit post in detail and try to determine whether it will be successful. Analyzing the data, we find multiple factors that affect a post's success. Timing is a major one, but the initial interactions with a submission matter most and can make or break it, which is often exploited by users who upvote content with fake accounts. We then use machine learning to test whether the number of upvotes a submission will receive can be predicted. A random forest model achieved the best results, with a mean absolute error close to 500 upvotes, although the error varies by subreddit. The findings in this thesis can help expose a weakness of social media networks, which leave control over what becomes popular to a minority of users, and show which elements affect the number of upvotes.
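To make the reported metric concrete, the following is a minimal sketch of how a mean absolute error over upvote predictions could be computed. The function name and the upvote counts are hypothetical, invented for illustration only; they are not data or code from the thesis.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Return the average of |actual - predicted| over all submissions."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one submission is required")
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical upvote counts (assumption: made up for this example).
actual_upvotes = [120, 3400, 15, 980]
predicted_upvotes = [300, 2900, 60, 1500]

mae = mean_absolute_error(actual_upvotes, predicted_upvotes)
print(mae)
```

Read this way, a mean absolute error close to 500 means that predictions are off by roughly 500 upvotes on average, so the same error is far more significant in a subreddit where posts typically receive few upvotes than in one where they receive thousands.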
Description
Master's thesis in Computer science