(Publisher of Peer Reviewed Open Access Journals)

ACCENTS Transactions on Information Security (TIS)

ISSN (Print):XXXX    ISSN (Online):2455-7196
Volume-8 Issue-32 October-2023
Paper Title : Advancements in big data clustering: methods, applications, and insights
Author Name : Chandan Kumar Soni and Mohan Kumar Patel
Abstract :

The digital age has brought an unprecedented influx of data, marking the era of big data. In this landscape, clustering has become a critical element of data analysis, enabling the discovery of latent patterns in vast datasets. This review paper surveys the state of the art in big data clustering, covering influential research, methodologies, advantages, and limitations. It highlights the advantages offered by different clustering algorithms across domains ranging from smart grids and education to e-commerce and other operational settings. It also acknowledges limitations such as scalability issues and generalization challenges, and underlines the importance of addressing these constraints in future research.

Keywords : Big data clustering, Data mining, Unsupervised learning, Clustering algorithms.
Cite this article : Soni CK, Patel MK. Advancements in big data clustering: methods, applications, and insights. ACCENTS Transactions on Information Security. 2023; 8(32):19-24. DOI:10.19101/TIS.2023.829003.