Authorship identification is a subfield of authorship
analysis that deals with finding the plausible author of
anonymous messages. Authorship identification of online
messages is a challenging task because cyber predators
exploit the anonymity of cyberspace to conceal their
identity. Forensic analysis of online messages can yield
empirical evidence, which can be used to prosecute
cybercriminals in court. Detecting true identities in this
way can reduce cybercrime to some extent. It is therefore
necessary to build innovative tools and techniques for
analyzing large volumes of suspicious online messages. This
paper compares the performance of several classifiers, in
terms of accuracy, on the task of authorship identification
of online messages. Support Vector Machine (SVM), K-Nearest
Neighbor (K-NN), and Naïve Bayes classifiers are used in the
experiments. The paper also investigates which classifier is
most appropriate for attributing authorship of anonymous
online messages in the context of cyber forensics.
Smita Nirkhi : has completed an M.Tech in Computer
Science & Engineering and is currently pursuing a PhD in computer
science. She has received an RPS grant of 8 lakhs from AICTE for her
research. She has attended 6 STTP workshops along with other
training programs. She has published 20 papers in international
conferences and 12 papers in international journals, and has presented
a paper at an international conference in Singapore. She has 13 years of
professional experience. Her areas of interest include soft computing,
data mining, web mining, pattern recognition, MANET, and digital
forensics.
Dr. R. V. Dharaskar : Former Director DES(Disha-DIMAT) Group of Institutes
Raipur, Chhattisgarh, India
Dr. V. M. Thakare : Department of Computer Science & Engineering
S.G.B Amravati University, Amravati, Maharashtra, India
Authorship Identification
Cybercrime
Cyber Forensics
Support Vector Machine
K-NN
Naïve Bayes
We have designed and proposed a technique for forensic
analysis of online messages that helps investigators
collect practical evidence by automatically analyzing large
collections of suspicious online messages. The analysis is
performed on the textual content of a message. The proposed
technique uses the frequency of common words in the
training and testing data. Function-word usage and
unique-word usage by each author can serve as
discriminators for identifying the plausible author of a
disputed text. In our experiments, SVM outperformed the
Naïve Bayes and K-NN classifiers. Different parameter
settings for authorship identification had an impact on
performance.
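
The idea of using function-word frequencies as an author
discriminator can be sketched in a few lines of Python. This is
only an illustration under stated assumptions, not the
experimental setup of the paper: the word list FUNCTION_WORDS,
the toy messages in TRAIN, and the helper names are all
hypothetical, and a 1-nearest-neighbor rule stands in for the
K-NN classifier (SVM and Naïve Bayes are not shown).

```python
import math
import re
from collections import Counter

# Hypothetical function-word list; the paper does not give its own list here.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that",
                  "is", "it", "for", "on", "with", "but", "not"]


def function_word_profile(text):
    """Return the relative frequency of each function word in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty text
    return [counts[w] / total for w in FUNCTION_WORDS]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def knn_predict(train, query_text, k=1):
    """Majority vote among the k training messages closest to the query.

    `train` is a list of (author, text) pairs.
    """
    query = function_word_profile(query_text)
    ranked = sorted(
        train,
        key=lambda item: euclidean(function_word_profile(item[1]), query),
    )
    votes = Counter(author for author, _ in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration only.
TRAIN = [
    ("A", "the cat sat on the mat and the dog sat on the rug"),
    ("B", "it is not that it is bad but it is not for me"),
]
QUERY = "the bird sat on the fence and the fox sat on the wall"

print(knn_predict(TRAIN, QUERY))  # prints: A
```

The query shares no content words with author A's message, yet
its function-word profile is identical to A's, which is why the
nearest-neighbor rule picks A. On real data the feature set, the
distance measure, and the value of k would all need tuning.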