International Journal of Advanced Technology and Engineering Exploration (IJATEE)

ISSN (Print):2394-5443    ISSN (Online):2394-7454
Volume-10 Issue-109 December-2023
Paper Title : Anomaly detection in smart contracts based on optimal relevance hybrid features analysis in the Ethereum blockchain employing ensemble learning
Author Name : Sabri Hisham, Mokhairi Makhtar and Azwa Abdul Aziz
Abstract :

Blockchain 2.0 extended blockchain, previously known mainly for cryptocurrency, into a platform for developing decentralized applications (DApps). The resulting growth in DApp development has inadvertently camouflaged fraudulent activity within smart contracts, leading to substantial losses for investors. Machine learning (ML) approaches can significantly improve anomaly detection, but many studies still struggle to select the most pertinent features. The challenge intensifies with the high-dimensional raw data extracted directly from the Ethereum blockchain network, which falls under the category of big data. Smart contracts, the core of blockchain that governs DApp logic, have increasingly become a haven for fraud. This study analyzes three primary feature components based on the contract source code (operation code (opcode), application binary interface (ABI) code, and contract transactions) to develop anomaly detection models for smart contracts using an ensemble hybrid feature strategy. The approach has two key stages: first, reducing the initial feature set through constant, quasi-constant, and variant validation; second, identifying the most relevant feature set using the searching for uncorrelated list of variables (SULOV) method, which is grounded in the minimum redundancy maximum relevance (MRMR) principle. The anomaly detection model is a voting ensemble trained on the dataset of most pertinent features. Its effectiveness is gauged by comparing its performance with that of individual models, including random forest (RF), k-nearest neighbor (KNN), decision tree (DT), linear discriminant analysis (LDA), and stochastic gradient descent (SGD).
The findings indicate that the proposed model achieves superior anomaly detection, with a determination value measurement rate of 92.99%, outperforming the individual classifiers using the 44 most relevant features while minimizing classification time. The model's efficiency is further corroborated through comparative analysis with previous studies and with alternative methodologies applied to the same contract dataset. The proposed ensemble-based model significantly improves anomaly detection in contract source code analysis while relying on a minimal, relevant set of features refined through the SULOV method.
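
The two-stage feature reduction and the voting step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the toy data, and both thresholds are hypothetical, absolute Pearson correlation with the label stands in for the relevance score that SULOV computes, and hard (majority) voting is assumed because the abstract does not state the voting rule.

```python
from collections import Counter
from math import sqrt
from statistics import pvariance


def drop_low_variance(columns, threshold):
    """Stage 1 (assumed form): drop constant and quasi-constant columns,
    i.e. those whose population variance does not exceed `threshold`."""
    return {name: vals for name, vals in columns.items()
            if pvariance(vals) > threshold}


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def prune_correlated(columns, target, limit):
    """Stage 2 (MRMR-style simplification): rank features by relevance to the
    target, then keep a feature only if it is not highly correlated with any
    feature already kept."""
    ranked = sorted(columns,
                    key=lambda n: abs(pearson(columns[n], target)),
                    reverse=True)
    kept = []
    for name in ranked:
        if all(abs(pearson(columns[name], columns[k])) <= limit for k in kept):
            kept.append(name)
    return kept


def majority_vote(predictions):
    """Hard voting: for each sample, return the label most models predicted."""
    return [Counter(sample).most_common(1)[0][0] for sample in zip(*predictions)]


# Toy data (hypothetical): 8 contracts, label 1 = anomalous.
target = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "const":  [1, 1, 1, 1, 1, 1, 1, 1],   # constant
    "quasi":  [0, 0, 0, 0, 0, 0, 0, 1],   # quasi-constant
    "a":      [0, 0, 0, 0, 1, 1, 1, 1],   # highly relevant
    "a_copy": [0, 0, 0, 1, 1, 1, 1, 1],   # redundant with "a"
    "b":      [0, 1, 0, 1, 0, 1, 1, 1],   # weakly relevant, not redundant
}

filtered = drop_low_variance(features, threshold=0.15)
selected = prune_correlated(filtered, target, limit=0.7)
print(selected)

# Three hypothetical base models, each predicting labels for three contracts.
votes = majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])
print(votes)
```

On this toy data the filter removes the constant and quasi-constant columns, the pruning step discards the redundant copy, and the vote returns the label chosen by at least two of the three models. An odd number of voters avoids ties in the binary case.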

Keywords : Ethereum, Blockchain, Smart contract, Features selection, Relevance features, Ensemble method, Anomaly detection.
Cite this article : Hisham S, Makhtar M, Aziz AA. Anomaly detection in smart contracts based on optimal relevance hybrid features analysis in the Ethereum blockchain employing ensemble learning. International Journal of Advanced Technology and Engineering Exploration. 2023; 10(109):1552-1579. DOI:10.19101/IJATEE.2023.10102216.