Prediction of Student's Performance using Selected Classification Methods: A Data Mining Approach


Educational Data Mining (EDM) has emerged as an interesting area of research that extracts useful knowledge from educational databases for purposes such as predicting students' success. The extracted knowledge helps institutions improve their teaching methods and learning processes. In this paper, we apply Decision Tree, Naïve Bayes and Neural Network classification methods to predict students' performance based on grade level. This aims to address the difficulty of predicting student performance in institutions. The objectives of this paper are to (i) implement the three classification methods independently on the student performance dataset, and (ii) determine the best method among the three. The results show that the Decision Tree produces the highest accuracy rate of 77.778%, followed by the Neural Network with an accuracy rate of 70.886%, while Naïve Bayes produces the lowest accuracy rate of 66.865%. The results suggest that Decision Tree be preferred over Naïve Bayes and Neural Network for predicting students' performance.

Published In : IJCSN Journal Volume 8, Issue 3

Date of Publication : June 2019

Pages : 276-284

Figures : 07

Tables : 05

Abba Babakura : Computer Science Unit, Department of Mathematics, UDUS, Sokoto, Nigeria.

Abubakar Roko : Computer Science Unit, Department of Mathematics, UDUS, Sokoto, Nigeria.

Aminu Bui : Computer Science Unit, Department of Mathematics, UDUS, Sokoto, Nigeria.

Ibrahim Saidu : Computer Science Unit, Department of Mathematics, UDUS, Sokoto, Nigeria.

Jobson Ewalefoh : Department of Political Science, UNISA, South Africa.

Educational Data Mining, Prediction, Student performance, Decision Tree, Neural Network and Naïve Bayes

In this research, an effort is made to assess the impact of our proposed features and models on student performance prediction. Predictions of student performance can be useful in many contexts. In this work, feature sets that significantly affect each student's performance are identified, and students' grade levels at the end of the academic year are predicted. The student performance dataset is used to experimentally evaluate the three classification methods. We implemented each of the methods, tested it on the test data, and obtained the classification results. The classification results show that the Decision Tree produces the highest accuracy rate of 77.778%, followed by the Neural Network with an accuracy rate of 70.886%, while Naïve Bayes produces the lowest accuracy rate of 66.865%.
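
The comparison described above, scoring each classifier by its accuracy rate and selecting the highest, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names `accuracy` and `rank_methods`, the grade-level labels, the method names, and all predictions below are hypothetical placeholders chosen for the example, and they do not reproduce the paper's dataset or its reported figures.

```python
from typing import Dict, Hashable, List, Sequence, Tuple


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Return the percentage of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


def rank_methods(
    y_true: Sequence[Hashable],
    predictions: Dict[str, Sequence[Hashable]],
) -> List[Tuple[str, float]]:
    """Score every method on the same test labels, best accuracy first."""
    scores = [(name, accuracy(y_true, preds)) for name, preds in predictions.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical grade-level labels for eight test students (illustrative only).
y_true = ["high", "middle", "low", "middle", "high", "low", "middle", "high"]

# Hypothetical predictions from three unnamed classifiers (illustrative only).
predictions = {
    "method_a": ["high", "middle", "low", "middle", "high", "low", "high", "high"],
    "method_b": ["high", "middle", "middle", "middle", "low", "low", "middle", "high"],
    "method_c": ["middle", "middle", "low", "high", "high", "middle", "middle", "low"],
}

ranking = rank_methods(y_true, predictions)
for name, score in ranking:
    print(f"{name}: {score:.3f}%")
```

Evaluating every method against the same test labels keeps the accuracy rates directly comparable, which is what a ranking of the three methods relies on.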
