Logical analysis of data (LAD) is an important subfield of supervised machine learning and data mining. It is a methodology
for data analysis that uses concepts from optimization, combinatorics, and Boolean functions. LAD is a binary classification method for
Boolean data with high explanatory power. Because patterns are the most important building blocks in LAD, they must be selected
carefully. One of the main drawbacks of LAD that needs to be addressed is the quality of the generated patterns and the extraction of
positive and negative patterns. With high-quality patterns, new observations can be classified with high accuracy. The proposed
methodology was developed to address this issue. It studies the LAD method and its refinements and defines quality measures for pattern
generation. It then improves the pattern selection procedure using an optimization technique, Mixed Integer Linear
Programming (MILP), implemented in the General Algebraic Modeling System (GAMS) with a MIP solver. Generating an
optimized set of patterns with this technique aims to select the most important patterns, improving both pattern quality and
classification accuracy. Experiments carried out on the SPECT dataset show the efficiency of the proposed method in minimizing the
number of generated patterns and increasing the accuracy of the classification model.
Published In: IJCSN Journal, Volume 7, Issue 6
Date of Publication: December 2018
Pages: 349-360
Figures: 17
Tables: 01
Abdulkareem Owehan Alresheedi:
is currently a Master's student
in the Computer Science department of the Computer College at
Qassim University. Alresheedi received his Bachelor's degree in
Computer Science in 2004. Since 2004, he has worked in the public
sector as the manager of Information Technology management. His
research interests lie in data mining, machine learning algorithms,
advanced optimization techniques, and big data.
Mohammed Abdullah Al-Hagery:
received his B.Sc. in
Computer Science from the University of Technology in
Baghdad, Iraq, in 1994. He received his M.Sc. in Computer Science
from the University of Science and Technology, Yemen, in 1998.
Al-Hagery completed his Ph.D. in Computer Science and
Information Technology (Software Engineering) at the
Faculty of Computer Science and IT, University of Putra
Malaysia (UPM), in 2004. He was head of the Computer Science
Department at the College of Science and Engineering, USTY,
Sana'a, from 2004 to 2007. Since 2007, he has been a staff
member at the Faculty of Computer, Department of Computer
Science, Qassim University, KSA. He has published more than 15
papers in international journals. Dr. Al-Hagery served as
head of the Research Centre at the Computer College, Qassim
University, KSA, from September 2012 to October 2018.
Logical Data Analysis, Optimization Techniques, Patterns Reduction, Machine Learning, Classification Accuracy, Set
Covering Problem
This paper makes a valuable contribution regarding
the optimization and selection of high-quality patterns. The
results show that the method is highly efficient at improving
the pattern selection procedure, and they can help researchers
continue developing optimization techniques for
classification problems.
1) For each pattern, calculate and extract the main criteria
that help in the pattern selection procedure. For this, R code
was developed to extract the hidden information (patterns)
from binary datasets together with their characteristics, such as
degree, homogeneity, and the number of positive (negative)
observations covered, and to list the indices of all covered
observations.
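The per-pattern criteria in this step can be illustrated with a minimal sketch. It is written in Python rather than the R code used by the authors, and it is not their implementation. It assumes a pattern is a conjunction of literals stored as a mapping from feature index to required binary value, that degree is the number of literals, and that the homogeneity of a positive pattern is the fraction of covered observations that are positive. The names `covers` and `pattern_stats` and the toy data are hypothetical.

```python
from typing import Dict, List, Sequence

# Assumption: a pattern is a conjunction of literals,
# stored as {feature index: required binary value}.
Pattern = Dict[int, int]


def covers(pattern: Pattern, observation: Sequence[int]) -> bool:
    """Return True if the observation satisfies every literal of the pattern."""
    return all(observation[j] == value for j, value in pattern.items())


def pattern_stats(
    pattern: Pattern,
    positives: List[Sequence[int]],
    negatives: List[Sequence[int]],
) -> dict:
    """Compute degree, homogeneity, and covered indices for a positive pattern."""
    pos_idx = [i for i, obs in enumerate(positives) if covers(pattern, obs)]
    neg_idx = [i for i, obs in enumerate(negatives) if covers(pattern, obs)]
    covered = len(pos_idx) + len(neg_idx)
    # Assumed definition: share of covered observations that are positive.
    homogeneity = len(pos_idx) / covered if covered else 0.0
    return {
        "degree": len(pattern),        # number of literals in the pattern
        "homogeneity": homogeneity,
        "positive_covered": pos_idx,   # indices into `positives`
        "negative_covered": neg_idx,   # indices into `negatives`
    }


# Toy binary data (hypothetical, not the SPECT dataset).
positives = [[1, 0, 1], [1, 1, 1], [0, 1, 1]]
negatives = [[0, 0, 1], [1, 0, 0]]

pure = pattern_stats({0: 1, 2: 1}, positives, negatives)  # x0 = 1 AND x2 = 1
mixed = pattern_stats({2: 1}, positives, negatives)       # x2 = 1 only
print(pure)
print(mixed)
```

For a negative pattern, the same function can be called with the two observation lists swapped, so that homogeneity measures the share of covered observations that are negative.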