(Publisher of Peer Reviewed Open Access Journals)

International Journal of Advanced Technology and Engineering Exploration (IJATEE)

ISSN (Print):2394-5443    ISSN (Online):2394-7454
Volume-8 Issue-75 February-2021
Paper Title : Comparison of affinity degree classification with four different classifiers in several data sets
Author Name : Rosyazwani Mohd Rosdan, Wan Suryani Wan Awang and Wan Aezwani Wan Abu Bakar
Abstract :

The notion of affinity has been widely used across research fields. In this research, affinity is employed to find the degree of affinity between two data sets and to classify through prediction. Because Affinity Degree (AD) classification is a new technique, it needs to be compared with other types of classification to test its suitability. This study therefore compares several machine learning techniques and determines the most efficient classification technique based on the data set. Four classification algorithms, K-Nearest Neighbour (KNN), Naive Bayes (NB), Decision Tree (J48), and Support Vector Machine (SVM), were compared with AD classification. Three data sets, breast cancer, acute inflammation, and iris plant, were used in the experiments. The results show that J48 has the best performance measures compared to the other four classifiers. However, the AD classification results are significant enough to suggest that further study can improve the technique.
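
The comparison described above, several classifiers scored with the same performance measures on the same data, can be sketched as follows. This is only an illustrative sketch, not the authors' experimental setup: the toy data, the function names, the choice of k = 3, and the majority-class baseline are all assumptions made for the example. AD classification is not implemented because its definition is not given in this abstract; it, NB, J48, and SVM would be added as further entries in the dictionary passed to `compare`.

```python
import math
from collections import Counter

# Hypothetical toy data (NOT from the paper): 2-D points labelled "a" or "b".
TRAIN_X = [(1, 1), (1, 2), (2, 1), (2, 2), (5, 5), (6, 5), (5, 6)]
TRAIN_Y = ["a", "a", "a", "a", "b", "b", "b"]
TEST_X = [(1.5, 1.5), (5.5, 5.5), (0.5, 1), (6, 6)]
TEST_Y = ["a", "b", "a", "b"]


def knn_predict(x, k=3):
    """Majority vote among the k training points nearest to x (Euclidean)."""
    nearest = sorted(zip(TRAIN_X, TRAIN_Y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def majority_predict(x):
    """Baseline stand-in: always predict the most frequent training label."""
    return Counter(TRAIN_Y).most_common(1)[0][0]


def performance(y_true, y_pred, positive):
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def compare(classifiers, positive="b"):
    """Score every classifier on the same test set and pick the most accurate."""
    results = {}
    for name, predict in classifiers.items():
        y_pred = [predict(x) for x in TEST_X]
        results[name] = performance(TEST_Y, y_pred, positive)
    best = max(results, key=lambda n: results[n]["accuracy"])
    return results, best


results, best = compare({"KNN": knn_predict, "majority": majority_predict})
for name, scores in results.items():
    print(name, scores)
print("best by accuracy:", best)
```

Scoring every classifier through one shared `performance` function keeps the measures directly comparable, which is the property a study like this one depends on when ranking techniques per data set.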

Keywords : Affinity degree (AD), K-nearest neighbour (KNN), Naive Bayes (NB), Decision tree (J48), Support vector machine (SVM).
Cite this article : Rosdan RM, Awang WS, Abu Bakar WA. Comparison of affinity degree classification with four different classifiers in several data sets. International Journal of Advanced Technology and Engineering Exploration. 2021; 8(75):247-257. DOI:10.19101/IJATEE.2020.762106.