Paper Title:
Enhancing frequent itemset mining through machine learning and nature-inspired algorithms: a comprehensive review
Author Name:
Ansh Kumar Verma and Animesh Kumar Dubey
Abstract:
Frequent itemset mining is a critical process in data mining, enabling the discovery of patterns and associations within large datasets. Traditional algorithms like Apriori, while effective, struggle with challenges related to high-dimensional data, scalability, and efficiency. This review explores the integration of machine learning, data mining, and nature-inspired algorithms to address these challenges. Techniques such as support vector machines (SVM), random forests, and meta-heuristic algorithms like genetic algorithms (GA) have been shown to enhance the feature selection and rule discovery processes in frequent itemset mining. By leveraging these advanced methodologies, frequent pattern (FP) mining can improve both accuracy and efficiency, particularly in applications such as healthcare, finance, and education. Experimental results demonstrate that hybrid approaches combining machine learning with optimization techniques outperform traditional methods in managing large-scale, high-dimensional datasets. Furthermore, innovative pruning strategies and federated frameworks for privacy-preserving mining add layers of security and precision to the data mining process. This paper discusses and highlights the potential of these integrated techniques to further advance FP mining in complex and big data environments.
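For readers unfamiliar with the Apriori baseline named in the abstract, the following is a minimal sketch of level-wise frequent itemset mining with subset-based pruning. It is not taken from the paper: the function name `apriori`, the absolute-count `min_support` parameter, and the toy transactions are illustrative assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every itemset that occurs in
    at least `min_support` transactions (absolute count, an assumption)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        # Prune step: keep a candidate only if all of its (k-1)-subsets
        # are themselves frequent (the Apriori property).
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Count step: one pass over the data per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1

    return frequent


# Hypothetical toy data, for illustration only.
toy_transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = apriori(toy_transactions, min_support=2)
for itemset, support in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

Each level requires a full pass over the transactions, and the candidate set can grow quickly as the number of distinct items increases, which is the scalability limitation of traditional methods that the abstract refers to.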
Keywords:
Frequent itemset mining, Machine learning, Meta-heuristic algorithms, Data mining, Optimization techniques.
Cite this article:
Verma AK, Dubey AK. Enhancing frequent itemset mining through machine learning and nature-inspired algorithms: a comprehensive review. International Journal of Advanced Computer Research. 2024;14(68):97-103. DOI:10.19101/IJACR.2024.1466022