Over the past decade, the use of online resources has
grown rapidly, in particular social media and
microblogging websites such as Facebook, Twitter and
YouTube, and mobile applications such as WhatsApp and
Line. Many companies have identified these resources as a
rich source of marketing knowledge. This knowledge
provides valuable feedback that helps them develop the
next generation of their products. In this paper, sentiment
analysis of a product is performed by extracting tweets
about that product and classifying each tweet as expressing
positive or negative sentiment. The authors propose a
hybrid approach that first applies unsupervised learning, in
the form of K-means clustering, to cluster the tweets and
then applies supervised learning methods such as Decision
Trees and Support Vector Machines for classification.
Mr. Rishabh Soni holds a B.E. (Computer Science and Engineering) from
Rajiv Gandhi Proudyogiki Vishwavidyalaya, Bhopal. He is currently
pursuing his M.Tech. (Computer Technology and Applications) at
NITTTR, Bhopal.
Dr. K. James Mathai is an Associate Professor in the
Department of Computer Engineering and Applications, NITTTR,
Bhopal.
Twitter
Clustering
Decision Trees
Sentiment Analysis
Social Media
This paper presents a hybrid mechanism, the
‘Cluster-then-predict Model’, to improve the accuracy of
predicting Twitter sentiment. Combining unsupervised
and supervised learning, in the form of K-means
clustering and Random Forest respectively, performed
better than several supervised learning algorithms on
their own, such as CART, SVM and logistic regression.
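
The cluster-then-predict idea can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not
the authors' implementation: the tweets and labels are invented toy
data, all function names are hypothetical, and a simple
nearest-class-mean classifier stands in for the Random Forest so that
the sketch needs nothing beyond the standard library. In practice a
machine learning library would normally replace these helpers.

```python
import math
import random
from collections import Counter

def bag_of_words(texts, vocab=None):
    """Turn each text into a word-count vector over a shared vocabulary."""
    tokens = [t.lower().split() for t in texts]
    if vocab is None:
        vocab = sorted({w for doc in tokens for w in doc})
    vectors = []
    for doc in tokens:
        counts = Counter(doc)
        vectors.append([float(counts[w]) for w in vocab])
    return vocab, vectors

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: distance(point, centroids[i]))

def kmeans(vectors, k, iterations=20, seed=0):
    """Plain K-means: assign points to centroids, then recompute the centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centroids)].append(v)
        # An empty cluster keeps its previous centroid.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids

def fit_cluster_then_predict(texts, labels, k=2):
    """Step 1: cluster the tweets. Step 2: fit one classifier per cluster."""
    vocab, X = bag_of_words(texts)
    centroids = kmeans(X, k)
    by_cluster = [{} for _ in range(k)]
    for x, y in zip(X, labels):
        by_cluster[nearest(x, centroids)].setdefault(y, []).append(x)
    # Stand-in classifier (assumption): one mean vector per class in each cluster.
    models = [{y: mean(rows) for y, rows in grp.items()} for grp in by_cluster]
    fallback = Counter(labels).most_common(1)[0][0]
    return vocab, centroids, models, fallback

def predict(model, text):
    """Route the tweet to its cluster, then classify with that cluster's model."""
    vocab, centroids, models, fallback = model
    _, (x,) = bag_of_words([text], vocab)
    class_means = models[nearest(x, centroids)]
    if not class_means:  # cluster received no training tweets
        return fallback
    return min(class_means, key=lambda y: distance(x, class_means[y]))

# Invented toy data, for illustration only.
tweets = [
    "love the new phone great camera",
    "great battery love it",
    "camera is great and screen is great",
    "hate the new phone terrible battery",
    "terrible screen hate it",
    "battery is terrible and camera is bad",
]
labels = ["positive", "positive", "positive",
          "negative", "negative", "negative"]

model = fit_cluster_then_predict(tweets, labels, k=2)
print(predict(model, "great camera love it"))
```

The design choice worth noting is that the classifier is trained
separately inside each cluster, so each model only has to separate
positive from negative tweets among texts that already resemble one
another.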