Sentiment analysis of social media posts using Apache Spark by Niels Dommerholt

Big data is a hot topic, and one of its practical applications is sentiment analysis of user-submitted content. For companies, awareness of sentiment trends in their user base is of great value. In this presentation we will give a practical demonstration of how we can leverage Spark to analyse a large volume of social media contributions (reddit comments) and reduce this data to manageable information. Apache Spark has proven to be a fast and reliable engine for large-scale data processing like this. We will start with a short, high-level introduction to our approach to sentiment analysis. We will then show the structure of the data, dive into the implementation of our Java code, and finally present the results of our analysis.

Niels Dommerholt is an experienced software engineer who shares his passion for developing beautiful applications with his colleagues at JDriven. He recently started a blog (http://niels.nu) on subjects and challenges he encounters. He is passionate about programming, big data and databases, and has worked on large big data projects for clients like ING, Philips, the Dutch Forensic Institute and the Dutch Department of Labour. Besides sharing his knowledge and passion with colleagues, customers and through his blog, he is also fond of sharing it during highly entertaining and passionate sessions at meetups and conferences. [DTA-9267]

November 7, 2016