Sentiment analysis of Rock Werchter 2015, based on Twitter tweets

Big data is a hot topic nowadays. Companies all over the world are becoming interested in big data and advanced analytics. These techniques can add value by supporting predictions, giving insight into customer behaviour and revealing hidden patterns that nobody had seen before.

In this post, one specific aspect of advanced analytics, sentiment analysis, is applied to tweets about Rock Werchter, one of the largest festivals in Belgium. The purpose of the analysis was to investigate how well the festival was received on Twitter, across all days and stages. Particular attention is given to the top 10 bands with the highest average 'positive' and 'negative' sentiment scores.


Examples of tweets that contain #RW15


The Process

  • Data Extraction:
    The data used to analyse Rock Werchter 2015 was scraped from Twitter; every tweet had to contain #RW15. This was done with an R script that uses a Twitter API for collecting tweets. A total of 12000 tweets were scraped and then cleaned for data quality purposes. As a result, 9565 tweets were used, each containing information such as the date of the tweet, its text and its retweet count.
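
The cleaning step is not described in detail in this post, so the following is only a minimal sketch of what such a step could look like. It is written in Python rather than the R used for the original scripts, and the `Tweet` record, the `clean_tweets` function and the two rules (require the hashtag, drop duplicate texts) are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Tweet:
    """Hypothetical record holding the fields mentioned in the post."""
    created_at: datetime
    text: str
    retweet_count: int


def clean_tweets(tweets, hashtag="#RW15"):
    """Keep tweets that mention the hashtag and drop duplicate texts.

    Both rules are assumed; the post only says the data was 'cleaned'.
    """
    seen = set()
    kept = []
    for tweet in tweets:
        # Collapse runs of whitespace so near-identical texts compare equal.
        text = " ".join(tweet.text.split())
        if hashtag.lower() not in text.lower():
            continue  # not about the festival
        key = text.lower()
        if key in seen:
            continue  # same text already kept
        seen.add(key)
        kept.append(Tweet(tweet.created_at, text, tweet.retweet_count))
    return kept
```

Calling `clean_tweets(raw_tweets)` on a list of `Tweet` records returns the filtered list in the original order; a real pipeline would probably also handle retweets and spam accounts.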
  • Data analysis and design:


    Description of sentiment analysis (ref.: Wikipedia)

    To gain insight into the collected data, descriptive statistics were computed. Tweets were collected between 23 June 2015 (10:23:25) and 29 July 2015 (20:54:02). A mean of 2252 tweets were scraped per day, with a maximum of 3185 tweets on 24 June 2015 (start of Rock Werchter). Overall, 43% of the tweets were posted by women and 57% by men. Since Rock Werchter 2015 had a broad line-up, descriptive statistics were also computed per band: a mean of 85 tweets were collected per band, and every band had at least 34 tweets. For the analysis itself, sentiment analysis was performed using several dictionaries. A total of 15000 positive words and 11436 negative words (in Dutch and English) were collected and screened for sentiment value. These values were then used to score the collected tweets.
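
To make the dictionary approach concrete, here is a minimal Python sketch of lexicon-based scoring. It is not the R script used for this analysis: the two tiny word sets, the tokenizer and the "positive minus negative" scoring rule are assumptions chosen for illustration, standing in for the much larger Dutch and English dictionaries described above.

```python
import re

# Tiny illustrative lexicons (assumed); the real dictionaries were far larger.
POSITIVE = {"great", "amazing", "love", "geweldig", "mooi"}
NEGATIVE = {"bad", "boring", "hate", "slecht", "saai"}

# A token is a run of word characters, optionally prefixed by # or @.
TOKEN = re.compile(r"[#@]?\w+")


def sentiment_score(text):
    """Return (# positive words) - (# negative words) found in the text."""
    tokens = [t.lower() for t in TOKEN.findall(text)]
    positives = sum(t in POSITIVE for t in tokens)
    negatives = sum(t in NEGATIVE for t in tokens)
    return positives - negatives


def sentiment_label(score):
    """Map a numeric score to a coarse label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For example, `sentiment_score("Amazing set, love it! #RW15")` counts two positive words and no negative ones, so the tweet is labelled positive. A simple count like this ignores negation ("not great") and sarcasm, which is a known limitation of dictionary-based methods.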

  • Outcome and results:


    Sentiment analysis of RW15 in Tableau

    Bands like Fufanu and Kid Ink scored high on Twitter. Around 75% of the tweets were classified as positive, and Saturday came out as the best festival day on Twitter. Finally, we observe that the bands with the highest sentiment scores played on Saturday and Sunday, while the bands with the lowest scores were scheduled on Thursday and Friday. A possible reason for this pattern is the cancellation of Foo Fighters on Thursday because of Dave Grohl's leg injury.
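
The ranking of bands by average sentiment score can be sketched as follows. This is an illustrative Python version, not the original R and Tableau workflow; the function names are hypothetical, the band names are placeholders, and it assumes each tweet has already been assigned to a band and given a numeric score (how that assignment was done is not described in this post).

```python
from collections import defaultdict


def mean_score_per_band(scored_tweets):
    """Average the scores per band.

    scored_tweets is an iterable of (band, score) pairs.
    """
    totals = defaultdict(lambda: [0, 0])  # band -> [sum of scores, count]
    for band, score in scored_tweets:
        totals[band][0] += score
        totals[band][1] += 1
    return {band: total / count for band, (total, count) in totals.items()}


def top_bands(means, n=10, highest=True):
    """Return the n bands with the highest (or lowest) mean score."""
    ranked = sorted(means.items(), key=lambda item: item[1], reverse=highest)
    return ranked[:n]


# Placeholder data, not results from the analysis.
scored = [("Band A", 2), ("Band A", 0), ("Band B", -1), ("Band C", 1)]
means = mean_score_per_band(scored)
best = top_bands(means, n=2)
worst = top_bands(means, n=1, highest=False)
```

With few tweets per band, a mean can be dominated by one or two extreme tweets, so a minimum tweet count per band (every band here had at least 34) helps keep the ranking meaningful.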

  • Remark:
    The scripts for Twitter scraping and sentiment analysis were written in R, while the visualizations were created in Tableau.

    Template design by Andrew Yuan - Used under CC BY 3.0. Modified by Martial Luyts.