An Experimental Study on Authorship Identification for Cyber Forensics


Authorship identification is a subfield of authorship analysis that deals with finding the plausible author of anonymous messages. Identifying the author of online messages is a challenging task because cyber predators exploit the anonymity of cyberspace to conceal their identity. Forensic analysis of online messages can yield empirical evidence, which can be used to prosecute cybercriminals in court. Detecting the true identities behind such messages can therefore reduce cybercrime to a certain extent. This requires innovative tools and techniques that can appropriately analyze large volumes of suspicious online messages. This paper compares the performance of various classifiers, in terms of accuracy, on the authorship identification task for online messages. Support Vector Machine (SVM), K-nearest neighbour (K-NN), and Naïve Bayes classifiers are used in the experiments. The paper also investigates which classifier is most appropriate for resolving the authorship of anonymous online messages in the context of cyber forensics.

Published In : IJCSN Journal Volume 4, Issue 5

Date of Publication : October 2015

Pages : 756 - 760

Figures : 06

Tables : 03

Publication Link : An Experimental Study on Authorship Identification for Cyber Forensics

Smita Nirkhi : has completed an M.Tech in Computer Science & Engineering and is currently pursuing a Ph.D. in computer science. She has received an RPS grant of 8 lakhs from AICTE for her research. She has attended 6 STTP workshops along with other training programs. She has published 20 papers in international conferences and 12 papers in international journals, and has presented a paper at an international conference in Singapore. She has 13 years of professional experience. Her areas of interest include soft computing, data mining, web mining, pattern recognition, MANET, and digital forensics.

Dr. R. V. Dharaskar : Former Director, DES (Disha-DIMAT) Group of Institutes, Raipur, Chhattisgarh, India

Dr. V. M. Thakare : Department of Computer Science & Engineering, S.G.B. Amravati University, Amravati, Maharashtra, India

Authorship Identification

Cybercrime

Cyber Forensics

Support Vector Machine

K-NN

Naïve Bayes

We have designed and proposed a technique for the forensic analysis of online messages that helps investigators collect practical evidence by automatically analyzing large collections of suspicious online messages. The analysis is performed on the textual content of a message. The proposed technique uses the frequency of common words in the training and testing data. Function word usage and unique word usage by each author can serve as discriminators to identify the plausible author of a disputed text. In our experiments, SVM outperformed the Naïve Bayes and K-NN classifiers. Different parameter settings for authorship identification had an impact on performance.
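The idea of using function-word frequencies as an author discriminator can be illustrated with a minimal sketch. This is not the experimental setup of the paper: the function-word list, the toy messages, the author names, and the helper names (`function_word_profile`, `predict_author`, `accuracy`) are all assumptions made for illustration, and the classifier is a plain 1-nearest-neighbour rule with cosine similarity (the simplest case of K-NN) rather than the SVM, K-NN, and Naïve Bayes configurations that were compared.

```python
import math
import re
from collections import Counter

# Hypothetical function-word list; the list used in the paper is not given here.
FUNCTION_WORDS = ["the", "a", "of", "and", "to", "in", "i", "you",
                  "is", "it", "that", "for", "on", "with", "but", "not"]


def function_word_profile(text):
    """Return the relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict_author(training, disputed_text):
    """1-nearest-neighbour: return the author of the most similar training message."""
    target = function_word_profile(disputed_text)
    best_author, _ = max(
        training,
        key=lambda pair: cosine(function_word_profile(pair[1]), target),
    )
    return best_author


def accuracy(training, test):
    """Fraction of test messages attributed to their true author."""
    correct = sum(predict_author(training, text) == author for author, text in test)
    return correct / len(test)


# Toy (author, message) pairs, invented for this example only.
TRAINING = [
    ("alice", "the report of the board is in the folder of the office"),
    ("bob", "i think you know but i am not sure you care"),
]
TEST = [
    ("alice", "the summary of the meeting is on the desk of the manager"),
    ("bob", "you said it but i did not hear you"),
]

if __name__ == "__main__":
    print("accuracy:", accuracy(TRAINING, TEST))
```

Accuracy here is the same measure used to compare the classifiers, namely the share of disputed messages attributed to their true author. A realistic experiment would need far more messages per author and a held-out test set.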
