International Journal of Advanced Technology and Engineering Exploration, ISSN (Print): 2394-5443, ISSN (Online): 2394-7454, Volume-13, Issue-136, March-2026
Balancing communication efficiency and model performance in federated learning using sparse models

Mohd Ahmed1 and Rajendra Kumar1

Department of Computer Science, Jamia Millia Islamia, New Delhi-110025, India1
Corresponding Author: Mohd Ahmed

Received: 29-March-2025; Revised: 20-March-2026; Accepted: 24-March-2026

Abstract

Federated learning (FL) faces significant challenges in balancing communication efficiency and model performance, particularly when deploying complex architectures such as residual network (ResNet-18) on heterogeneous, real-world datasets like the Canadian Institute for Advanced Research (CIFAR-10) dataset. This paper investigates the combined effects of model complexity, dataset difficulty, and non-independent and identically distributed (non-IID) data distributions on the sparsified FL method FedSparseT. It provides a comprehensive analysis of the role of sparsity in reducing communication overhead while maintaining model accuracy. Through systematic experimentation across varying sparsity levels (0.025–0.4), the trade-offs among compression efficiency, convergence stability, and final model performance are evaluated in resource-constrained, non-IID environments. The results demonstrate that higher sparsity levels reduce communication costs by up to 17% (a transmitted size of 70.47 KB, compared with 84.76 KB at 2.5% sparsity) while also delivering improved final accuracy (0.52 at the 200th round). Furthermore, it is observed that the architectural complexity of ResNet-18, combined with the fine-grained classification requirements of CIFAR-10, creates a highly challenging scenario under non-IID conditions. In such settings, sparsification may inadvertently prune critical parameters and thereby slow feature learning, as evidenced by accuracy below 0.2 during the early training rounds. Although sparsity functions as an implicit regularization mechanism (final loss of 2.12 at 4% sparsity versus 2.23 at 2.5%), its effectiveness is limited by fixed sparsification thresholds, which fail to adapt to evolving model requirements and client heterogeneity.
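
The fixed-level sparsification evaluated above can be made concrete with a short sketch. The Python snippet below is illustrative only, not FedSparseT's actual implementation; the function names, the update size, and the interpretation of the sparsity level as the fraction of entries pruned are assumptions. It keeps the largest-magnitude entries of a client's model update at each of the sparsity levels studied in the paper (0.025–0.4), so that only the surviving values and their indices would need to be transmitted to the server.

```python
# Illustrative sketch of fixed-level magnitude sparsification of a client
# update (not FedSparseT's actual code; names and sizes are assumptions).
import numpy as np

def sparsify_update(update: np.ndarray, sparsity: float):
    """Prune the `sparsity` fraction of smallest-magnitude entries.

    Returns the surviving values and their flat indices; the server can
    treat every pruned coordinate as zero, so only the survivors need
    to be transmitted.
    """
    flat = update.ravel()
    keep = max(1, int(round((1.0 - sparsity) * flat.size)))
    idx = np.argpartition(np.abs(flat), -keep)[-keep:]  # top-`keep` by magnitude
    return flat[idx], idx

# A stand-in for one client's model update (e.g., a flattened weight delta).
rng = np.random.default_rng(0)
update = rng.normal(size=100_000).astype(np.float32)

for s in (0.025, 0.04, 0.1, 0.4):  # sparsity levels studied in the paper
    vals, idx = sparsify_update(update, s)
    print(f"sparsity {s:5.3f}: {vals.size:6d} of {update.size} entries transmitted")
```

Because the kept fraction here is fixed for the entire run, the sketch also illustrates the abstract's closing observation: nothing in such a scheme adapts the retained parameter set to training progress or to per-client data heterogeneity.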

Keywords

Federated learning (FL), Sparsification, Non-IID data distribution, Communication efficiency, ResNet-18, CIFAR-10 dataset.

Cite this article

Ahmed M, Kumar R. Balancing communication efficiency and model performance in federated learning using sparse models. International Journal of Advanced Technology and Engineering Exploration. 2026;13(136):393-409. DOI: 10.19101/IJATEE.2025.121220419

References

[1] Mcmahan B, Moore E, Ramage D, Hampson S, Yarcas BA. Communication-efficient learning of deep networks from decentralized data. In artificial intelligence and statistics 2017 (pp. 1273-82). PMLR.

[2] Li T, Sahu AK, Talwalkar A, Smith V. Federated learning: challenges, methods, and future directions. IEEE signal processing magazine. 2020; 37(3):50-60.

[3] Xu J, Glicksberg BS, Su C, Walker P, Bian J, Wang F. Federated learning for healthcare informatics. Journal of Healthcare Informatics Research. 2021; 5(1):1-9.

[4] Alghamdi I, Anagnostopoulos C, Pezaros DP. Time-optimized task offloading decision making in mobile edge computing. In wireless days (WD) 2019 (pp. 1-8). IEEE.

[5] Vani K, Swornambiga SP. Adaptive intrusion detection framework for enhanced cloud security in fog and edge computing environments. International Journal of Advanced Technology and Engineering Exploration. 2024; 11(121):1613-40.

[6] Quan PK, Kundroo M, Kim T. Experimental evaluation and analysis of federated learning in edge computing environments. IEEE Access. 2023; 11:33628-39.

[7] Shi Z, Gong D, Yan X. A comprehensive review of federated learning: concepts, aggregation methods, applications, and challenges. In international conference on logistics, informatics and service sciences 2024 (pp. 1023-34). Singapore: Springer Nature Singapore.

[8] Chen H, Wang H, Long Q, Jin D, Li Y. Advancements in federated learning: models, methods, and privacy. ACM Computing Surveys. 2024; 57(2):1-39.

[9] Zheng X, Ying S, Zheng F, Yin J, Zheng L, Chen C, et al. Federated learning on non-iid data via local and global distillation. In international conference on web services (ICWS) 2023 (pp. 647-57). IEEE.

[10] Zhu X, Li J, Liu Y, Ma C, Wang W. A survey on model compression for large language models. Transactions of the Association for Computational Linguistics. 2024; 12:1556-77.

[11] Hoefler T, Alistarh D, Ben-nun T, Dryden N, Peste A. Sparsity in deep learning: pruning and growth for efficient inference and training in neural networks. Journal of Machine Learning Research. 2021; 22(241):1-24.

[12] Cheng Y, Wang D, Zhou P, Zhang T. Model compression and acceleration for deep neural networks: the principles, progress, and challenges. IEEE Signal Processing Magazine. 2018; 35(1):126-36.

[13] Sattler F, Wiedemann S, Müller KR, Samek W. Robust and communication-efficient federated learning from non-iid data. Transactions on Neural Networks and Learning Systems. 2019; 31(9):3400-13.

[14] Di X, Fan X, Chen L, Li M, Zhang M. Communication-privacy-accuracy trade-offs in federated learning for non-IID data with shuffle model. Knowledge-Based Systems. 2025; 324:113872.

[15] Le DD, Tran AK, Pham TB, Huynh TN. A survey of model compression and its feedback mechanism in federated learning. In proceedings of the 5th ACM workshop on intelligent cross-data analysis and retrieval 2024 (pp. 37-42). ACM.

[16] Deng X, Li D, Sun T, Lu X. Communication-efficient distributed learning via sparse and adaptive stochastic gradient. IEEE Transactions on Big Data. 2024; 11(1):234-46.

[17] Wangni J, Wang J, Liu J, Zhang T. Gradient sparsification for communication-efficient distributed optimization. Advances in Neural Information Processing Systems. 2018.

[18] Han P, Wang S, Leung KK. Adaptive gradient sparsification for efficient federated learning: an online learning approach. In 40th international conference on distributed computing systems (ICDCS) 2020 (pp. 300-10). IEEE.

[19] Stich SU, Karimireddy SP. The error-feedback framework: SGD with delayed gradients. Journal of Machine Learning Research. 2020; 21(237):1-36.

[20] Haddadpour F, Kamani MM, Mokhtari A, Mahdavi M. Federated learning with compression: unified analysis and sharp guarantees. In international conference on artificial intelligence and statistics 2021 (pp. 2350-8). PMLR.

[21] Ozfatura E, Ozfatura K, Gündüz D. Time-correlated sparsification for communication-efficient federated learning. In international symposium on information theory (ISIT) 2021 (pp. 461-6). IEEE.

[22] Zhao Z, Mao Y, Liu Y, Song L, Ouyang Y, Chen X, et al. Towards efficient communications in federated learning: a contemporary survey. Journal of the Franklin Institute. 2023; 360(12):8669-703.

[23] Guastella A, Sani L, Iacob A, Mora A, Bellavista P, Lane ND. SparsyFed: sparse adaptive federated learning. In the thirteenth international conference on learning representations 2025 (pp.1-33).

[24] Kanchan S, Jang JW, Yoon JY, Choi BJ. Efficient and privacy-preserving group signature for federated learning. Future Generation Computer Systems. 2023; 147:93-106.

[25] Zhang Y, Lin M, Lin Z, Luo Y, Li K, Chao F, et al. Learning best combination for efficient n: M sparsity. Advances in Neural Information Processing Systems. 2022; 35:941-53.

[26] Beitollahi M, Liu M, Lu N. DSFL: dynamic sparsification for federated learning. In 5th international conference on communications, signal processing, and their applications (ICCSPA) 2022 (pp. 1-6). IEEE.

[27] Gong X, Song L, Vedula R, Sharma A, Zheng M, Planche B, et al. Federated learning with privacy-preserving ensemble attention distillation. IEEE Transactions on Medical Imaging. 2022; 42(7):2057-67.

[28] Hu R, Gong Y, Guo Y. Federated learning with sparsification-amplified privacy and adaptive optimization. In proceedings of the thirtieth international joint conference on artificial intelligence 2021 (pp.1-7).

[29] Chung WC, Lo CA, Lin YH, Chen ZH, Hung CL. Decentralized federated learning with non-IID data: challenges, trends, and future opportunities. ACM Computing Surveys. 2026; 58(8):1-41.

[30] Asad M, Shaukat S, Hu D, Wang Z, Javanmardi E, Nakazato J, et al. Limitations and future aspects of communication costs in federated learning: a survey. Sensors. 2023; 23(17):1-31.

[31] Ye Y, Li S, Liu F, Tang Y, Hu W. EdgeFed: optimized federated learning based on edge computing. IEEE Access. 2020; 8:209191-8.

[32] Lan G, Liu XY, Zhang Y, Wang X. Communication-efficient federated learning for resource-constrained edge devices. IEEE Transactions on Machine Learning in Communications and Networking. 2023; 1:210-24.

[33] Xiong Y, Wang R, Cheng M, Yu F, Hsieh CJ. Feddm: iterative distribution matching for communication-efficient federated learning. In proceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 16323-32). IEEE.

[34] Chai ZY, Yang CD, Li YL. Communication efficiency optimization in federated learning based on multi-objective evolutionary algorithm. Evolutionary Intelligence. 2023; 16(3):1033-44.

[35] Konecný J, Mcmahan HB, Felix XY, Richtárik P, Suresh AT, Bacon D. Federated learning: strategies for improving communication efficiency. 29th conference on neural information processing systems (NIPS 2016), Barcelona, Spain 2016 (pp.1-5).

[36] Jiang Z, Xu Y, Xu H, Wang Z, Liu J, Chen Q, et al. Computation and communication efficient federated learning with adaptive model pruning. IEEE Transactions on Mobile Computing. 2023; 23(3):2003-21.

[37] Tang J, Ding X, Hu D, Guo B, Shen Y, Ma P, et al. FedRAD: heterogeneous federated learning via relational adaptive distillation. Sensors. 2023; 23(14):1-17.

[38] Song Y, Liu H, Zhao S, Jin H, Yu J, Liu Y, et al. Fedadkd: heterogeneous federated learning via adaptive knowledge distillation. Pattern Analysis and Applications. 2024; 27(4):1-18.

[39] Kim M, Saad W, Debbah M, Hong CS. SpaFL: communication-efficient federated learning with sparse models and low computational overhead. Advances in Neural Information Processing Systems. 2024; 37:86500-27.

[40] Rao AS, Muhamed A, Diddee H. Less is fed more: sparsity reduces feature distortion in federated learning. In proceedings of the 1st workshop on customizable NLP: progress and challenges in customizing NLP for a domain, application, group, or individual (CustomNLP4U) 2024 (pp. 37-46). Association for Computational Linguistics.

[41] Yu D, Yuan Y, Zou Y, Zhang X, Liu Y, Cui L, et al. Pruning-based adaptive federated learning at the edge. IEEE Transactions on Computers. 2025; 74(5):1538-48.

[42] Wang H, Liu Z, Hoshino K, Zhang T, Walters JP, Crago S. FedPaI: achieving extreme sparsity in federated learning via pruning at initialization. In international conference on intelligent computing 2025 (pp. 350-64). Singapore: Springer Nature Singapore.

[43] Xie W, Li H, Ma J, Li Y, Lei J, Liu D, et al. Jointsq: joint sparsification-quantization for distributed learning. In proceedings of the IEEE/CVF conference on computer vision and pattern recognition 2024 (pp. 5778-87). IEEE.

[44] Cheng A, Wang P, Zhang XS, Cheng J. Differentially private federated learning with local regularization and sparsification. In proceedings of the IEEE/CVF conference on computer vision and pattern recognition 2022 (pp. 10122-31). IEEE.

[45] Zheng J, Tang J. Communication-efficient federated learning based on compressed sensing and ternary quantization. Applied Intelligence. 2025; 55(2):1-13.

[46] Sreenivasan K, Sohn JY, Yang L, Grinde M, Nagle A, Wang H, et al. Rare gems: finding lottery tickets at initialization. Advances in Neural Information Processing Systems. 2022; 35:14529-40.

[47] Bistritz I, Mann A, Bambos N. Distributed distillation for on-device learning. Advances in Neural Information Processing Systems. 2020; 33:22593-604.

[48] Jiang Y, Wang S, Valls V, Ko BJ, Lee WH, Leung KK, et al. Model pruning enables efficient federated learning on edge devices. IEEE Transactions on Neural Networks and Learning Systems. 2022; 34(12):10374-86.

[49] Hu P, Peng X, Zhu H, Aly MM, Lin J. Opq: compressing deep neural networks with one-shot pruning-quantization. In proceedings of the AAAI conference on artificial intelligence 2021 (pp. 7780-8). PKP Publishing Services Network.

[50] Li J, Zhang Y, Li Y, Gong X, Wang W. FedSparse: a communication-efficient federated learning framework based on sparse updates. Electronics. 2024; 13(24):1-22.

[51] Verma S, Pesquet JC. Sparsifying networks via subdifferential inclusion. In international conference on machine learning 2021 (pp. 10542-52). PMLR.

[52] Wang Z, Duan Q, Xu Y, Zhang L. An efficient bandwidth-adaptive gradient compression algorithm for distributed training of deep neural networks. Journal of Systems Architecture. 2024; 150:103116.

[53] Li D. Eb-fedavg: personalized and training efficient federated learning with early-bird tickets. In southwest data science conference 2022 (pp. 213-26). Cham: Springer Nature Switzerland.

[54] Luo JH, Wu J, Lin W. Thinet: a filter level pruning method for deep neural network compression. In proceedings of the IEEE international conference on computer vision 2017 (pp. 5058-66). IEEE.

[55] Zhou X, Zhang W, Xu H, Zhang T. Effective sparsification of neural networks with global sparsity constraint. In proceedings of the IEEE/CVF conference on computer vision and pattern recognition 2021 (pp. 3599-608). IEEE.

[56] Sattler F, Wiedemann S, Müller KR, Samek W. Sparse binary compression: towards distributed deep learning with minimal communication. In international joint conference on neural networks (IJCNN) 2019 (pp. 1-8). IEEE.

[57] He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In proceedings of the IEEE conference on computer vision and pattern recognition 2016 (pp. 770-8). IEEE.

[58] Cai R, Chen X, Liu S, Srinivasa J, Lee M, Kompella R, et al. Many-task federated learning: a new problem setting and a simple baseline. In proceedings of the IEEE/CVF conference on computer vision and pattern recognition 2023 (pp. 5037-45). IEEE.

[59] Huba D, Nguyen J, Malik K, Zhu R, Rabbat M, Yousefpour A, et al. Papaya: practical, private, and scalable federated learning. Proceedings of Machine Learning and Systems. 2022; 4:814-32.

[60] Ak KE, Lim JH, Tham JY, Kassim AA. Attribute manipulation generative adversarial networks for fashion images. In proceedings of the IEEE/CVF international conference on computer vision 2019 (pp. 10541-50). IEEE.

[61] Kundroo M, Kim T. Demystifying impact of key hyper-parameters in federated learning: a case study on CIFAR-10 and FashionMNIST. IEEE Access. 2024; 12:120570-83.

[62] Wu J. Introduction to convolutional neural networks. National Key Lab for Novel Software Technology. Nanjing University. China. 2017; 5(23):495.

[63] Zhang X, Zhou X, Lin M, Sun J. Shufflenet: an extremely efficient convolutional neural network for mobile devices. In proceedings of the IEEE conference on computer vision and pattern recognition 2018 (pp. 6848-56). IEEE.

[64] https://medium.com/@danqing/a-practical-guide-to-relu-b83ca804f1f7. Accessed 15 April 2025.

[65] Liu Y, Zhao R, Kang J, Yassine A, Niyato D, Peng J. Towards communication-efficient and attack-resistant federated edge learning for industrial internet of things. ACM Transactions on Internet Technology (TOIT). 2021; 22(3):1-22.

[66] Wei XX, Huang H. Edge devices clustering for federated visual classification: a feature norm based framework. IEEE Transactions on Image Processing. 2023; 32:995-1010.

[67] Kairouz P, Mcmahan HB. Advances and open problems in federated learning. Foundations and Trends in Machine Learning. 2021; 14(1-2):1-210.

[68] Chen D, Gao D, Xie Y, Pan X, Li Z, Li Y, et al. Fs-real: towards real-world cross-device federated learning. In proceedings of the 29th SIGKDD conference on knowledge discovery and data mining 2023 (pp. 3829-41). ACM.

[69] Li Q, Diao Y, Chen Q, He B. Federated learning on non-iid data silos: an experimental study. In IEEE 38th international conference on data engineering (ICDE) 2022 (pp. 965-78). IEEE.

[70] Li T, Sahu AK, Zaheer M, Sanjabi M, Talwalkar A, Smith V. Federated optimization in heterogeneous networks. Proceedings of Machine learning and systems. 2020; 2:429-50.

[71] Guendouzi BS, Ouchani S, Assaad HE, Zaher ME. A systematic review of federated learning: challenges, aggregation methods, and development tools. Journal of Network and Computer Applications. 2023; 220:103714.

[72] Kotz S, Balakrishnan N, Johnson NL. Continuous multivariate distributions, volume 1: models and applications. John Wiley & Sons; 2019.