An Approach of Cross-Domain Sentiment Analysis for Opinion Mining


Sentiment analysis is now important for many tasks and applications, such as market analysis, opinion mining, and contextual advertising. Domain generalization remains a challenge in sentiment analysis, so this paper proposes methods for cross-domain sentiment analysis, in which a classifier trained on one domain (the source) is used to classify data from another domain (the target). We create a glossary using labeled data from the source domain and unlabeled data from both the source and target domains. The glossary contains clusters of semantically similar words and is used to handle the feature mismatch problem. To generate the glossary, we first compute a co-occurrence matrix using pointwise mutual information (PMI) [22]; then, following the distributional hypothesis [23], we efficiently group words that occur in similar contexts. At test time, the glossary is used to find similar words and thereby resolve the feature mismatch. The proposed methods are expected to perform well and to achieve accuracy close to that of domain adaptation.
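The glossary construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`pmi_vectors`, `cosine`, `build_glossary`), the review-level co-occurrence window, the use of cosine similarity as the relatedness measure, and the `top_k` parameter are all assumptions made for illustration. The sketch also ignores sentiment labels, which the paper uses for source-domain data.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_vectors(reviews):
    """Map each word to a sparse vector of positive PMI scores (assumed window: one review)."""
    word_count = Counter()
    pair_count = Counter()
    for review in reviews:
        words = set(review.lower().split())
        word_count.update(words)
        for a, b in combinations(sorted(words), 2):
            pair_count[(a, b)] += 1
    n = len(reviews)
    vectors = {w: {} for w in word_count}
    for (a, b), c in pair_count.items():
        # PMI(a, b) = log(p(a, b) / (p(a) * p(b))), with probabilities estimated per review
        pmi = math.log((c * n) / (word_count[a] * word_count[b]))
        if pmi > 0:  # keep only positive associations
            vectors[a][b] = pmi
            vectors[b][a] = pmi
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def build_glossary(vectors, top_k=3):
    """For each word, list up to top_k words whose PMI context vectors are most similar."""
    glossary = {}
    for word, vec in vectors.items():
        scored = [(cosine(vec, other), o) for o, other in vectors.items() if o != word]
        scored = [(s, o) for s, o in scored if s > 0]
        scored.sort(key=lambda t: (-t[0], t[1]))  # highest similarity first, ties alphabetical
        glossary[word] = [o for _, o in scored[:top_k]]
    return glossary

# Toy unlabeled reviews standing in for pooled source and target domain data.
reviews = [
    "excellent product recommended",
    "great product recommended",
    "broken screen returned",
    "faulty screen returned",
]
glossary = build_glossary(pmi_vectors(reviews))
print(glossary["excellent"])
```

In this toy corpus, "excellent" and "great" never appear together but share the same contexts, so the distributional step ranks them as closest neighbours. That is the property the glossary relies on to bridge words that appear in only one domain.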

Published In : IJCSN Journal Volume 5, Issue 2

Date of Publication : April 2016

Pages : --

Figures : 01

Tables : 02

Publication Link : An Approach of Cross-Domain Sentiment Analysis for Opinion Mining

Dr. Mrs. S. P. Khandait : Professor & Head of Information Technology Dept., K.D.K. College of Engineering Nagpur, Maharashtra, India

Dr. P. D. Khandait : Professor & Head of Electronics Engineering Dept., K.D.K. College of Engineering Nagpur, Maharashtra, India

Mr. Pravin D. Jambhulkar : Assistant Professor of Computer Technology Dept., K.D.K. College of Engineering Nagpur, Maharashtra, India

cross-domain, sentiment glossary, feature vector, distributional relatedness

We present an approach to cross-domain sentiment analysis in which we create a sentiment glossary that is used to handle the feature mismatch problem of cross-domain sentiment classification. We create the glossary from labeled and unlabeled instances from the source and target domains, applying PMI to compute word co-occurrences and a distributional relatedness measure to compute similarity among words. Finally, we show how to extend review features, which are then used to train a binary classifier.
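The feature extension step can be sketched as follows. This is only an illustration under stated assumptions: the function name `expand_features`, the bag-of-words representation, the hand-written toy glossary, and the down-weighting factor `weight` are hypothetical and are not taken from the paper.

```python
from collections import Counter

def expand_features(review, glossary, weight=0.5):
    """Return a bag-of-words vector extended with glossary neighbours.

    `weight` (an assumed parameter) down-weights words that were added
    from the glossary relative to words actually present in the review.
    """
    features = Counter(review.lower().split())
    expanded = dict(features)
    for word, count in features.items():
        for neighbour in glossary.get(word, []):
            if neighbour not in features:
                expanded[neighbour] = expanded.get(neighbour, 0.0) + weight * count
    return expanded

# Hypothetical glossary entry: "delicious" (target domain) is related to
# "excellent" and "great" (seen in the source domain).
toy_glossary = {"delicious": ["excellent", "great"]}
vector = expand_features("delicious meal", toy_glossary)
print(vector)
```

A classifier trained only on source-domain reviews has no weight for "delicious", but after expansion the review also carries "excellent" and "great", which the classifier may have seen. The extended vectors can then be passed to any binary classifier.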

[1] P. Sanju and T. T. Mirnalinee, “Cross Domain Sentiment Classification by Extracting Best Opinion Features,” Australian Journal of Basic and Applied Sciences, ISSN: 1991-8178, EISSN: 2309-8414, 2016.
[2] Alejandro Moreo Fernández, “Distributional Correspondence Indexing for Cross-Lingual and Cross-Domain Sentiment Classification,” Journal of Artificial Intelligence Research 55 (2015) 131-163, 2016.
[3] Danushka Bollegala, David Weir, and John Carroll, “Cross-Domain Sentiment Classification Using a Sentiment Sensitive Thesaurus,” IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 8, August 2013.
[4] Sinno Jialin Pan, Xiaochuan Ni, Jian-Tao Sun, Qiang Yang, and Zheng Chen, “Cross-Domain Sentiment Classification via Spectral Feature Alignment,” 19th Int’l Conf. World Wide Web (WWW ’10).
[5] J. Blitzer, M. Dredze, and F. Pereira, “Domain Adaptation for Sentiment Classification,” 45th Ann. Meeting of the Assoc. Computational Linguistics (ACL ’07).
[6] T. Briscoe, J. Carroll, and R. Watson, “The Second Release of the RASP System,” Proc. COLING/ACL Interactive Presentation Sessions Conf., 2006.
[7] T. Joachims, “Text Categorization with Support Vector Machines: Learning with Many Relevant Features,” Proc. 10th European Conf. Machine Learning (ECML ’98), pp. 137-142, 1998.
[8] D. Lin, “Automatic Retrieval and Clustering of Similar Words,” Proc. Ann. Meeting of the Assoc. Computational Linguistics (ACL ’98), pp. 768-774, 1998.
[9] P. Turney, “Similarity of Semantic Relations,” Computational Linguistics, vol. 32, no. 3, pp. 379-416, 2006.
[10] D. Lin, “Automatic Retrieval and Clustering of Similar Words,” Proc. Ann. Meeting of the Assoc. Computational Linguistics (ACL ’98), pp. 768-774, 1998.
[11] S. Sarawagi and A. Kirpal, “Efficient Set Joins on Similarity Predicates,” Proc. ACM SIGMOD Int’l Conf. Management of Data, pp. 743-754, 2004.
[12] M. Hu and B. Liu, “Mining and Summarizing Customer Reviews,” Proc. 10th ACM SIGKDD Int’l Conf. Knowledge Discovery and Data Mining (KDD ’04), pp. 168-177, 2004.
[13] S. Xie, W. Fan, J. Peng, O. Verscheure, and J. Ren, “Latent Space Domain Transfer between High Dimensional Overlapping Distributions,” 18th International World Wide Web Conference, pp. 91-100, April 2009.
[14] J.M. Wiebe, “Learning Subjective Adjectives from Corpora,” Proc. 17th Nat’l Conf. Artificial Intelligence and 12th Conf. Innovative Applications of Artificial Intelligence (AAAI ’00), pp. 735-740, 2000.
[15] T.-K. Fan and C.-H. Chang, “Sentiment-Oriented Contextual Advertising,” Knowledge and Information Systems, vol. 23, no. 3, pp. 321-344, 2010.
[16] D. Lin, “Automatic Retrieval and Clustering of Similar Words,” Proc. Ann. Meeting of the Assoc. Computational Linguistics (ACL ’98), pp. 768-774, 1998.
[17] N. Jindal and B. Liu, “Opinion Spam and Analysis,” Proc. Int’l Conf. Web Search and Web Data Mining, pp. 219-230, Palo Alto, California, USA, ACM, 2008.
[18] S. Xie, W. Fan, J. Peng, O. Verscheure, and J. Ren, “Latent Space Domain Transfer between High Dimensional Overlapping Distributions,” 18th International World Wide Web Conference, pp. 91-100, April 2009.
[19] B. Pang and L. Lee, “Opinion Mining and Sentiment Analysis,” Foundations and Trends in Information Retrieval, vol. 2, nos. 1/2, pp. 1-135, 2008.
[20] Kranti Ghag and Ketan Shah, “Comparative Analysis of the Techniques for Sentiment Analysis,” ICATE 2013.
[21] A.Y. Ng, “Feature Selection, L1 vs. L2 Regularization, and Rotational Invariance,” Proc. 21st Int’l Conf. Machine Learning (ICML ’04), 2004.
[22] H. Daumé III, A. Kumar, and A. Saha, “Frustratingly Easy Semi-Supervised Domain Adaptation,” Proc. 2010 Workshop on Domain Adaptation for Natural Language Processing, ACL 2010, pp. 53-59, 2010.
[23] P. Pantel and D. Ravichandran, “Automatically Labeling Semantic Classes,” Proc. Conf. North Am. Ch. Assoc. for Computational Linguistics: Human Language Technologies (NAACL-HLT ’04), pp. 321-328, 2004.
[24] Gregory Grefenstette, “Automatic Thesaurus Generation from Raw Text Using Knowledge-Poor Techniques,” Making Sense of Words, 9th Annual Conference of the UW Centre for the New OED and Text Research, 1993.