Joint dynamic correction algorithms for local and global drifts in federated learning

【Author】 Qi Yincheng; Huo Yalin; Wang Ning; Hou Yu

【Corresponding Author】 Qi Yincheng

【Affiliation】 Department of Electronic and Communication Engineering, North China Electric Power University; Hebei Key Laboratory of Power Internet of Things Technology, North China Electric Power University; Wuhan Branch, State Grid Hubei Comprehensive Energy Service Co., Ltd.

【Abstract】 Objective: Federated learning enables multiple parties to collaboratively train a machine learning model without communicating their local data. In practical applications, the data across nodes usually follow a non-independent and identically distributed (non-IID) pattern. During local updates, each client model is optimized toward its local optimum (i.e., fitting its individual feature distribution) rather than the global objective, raising a client update drift. Meanwhile, during global updates that aggregate these diverged local models, the server model is further distracted by the set of mismatched local optima, which subsequently leads to a global drift at the server model. To solve the problems of slow global convergence and the increasing number of training communication rounds caused by non-IID data, this paper proposes a joint dynamic correction federated learning algorithm (FedJDC) that is optimized from both the client and the server.

Method: To reduce the influence of non-IID data on federated learning, this paper carries out a joint optimization over the two aspects of local model update and global model update and proposes the FedJDC algorithm. FedJDC uses the cosine similarity between the local and global update directions to measure the offset of each participating client. Because each client has a different degree of non-IID data, determining the model offset solely from the cosine similarity computed in the current round can make the model update unstable. FedJDC therefore defines a cumulative offset and introduces an attenuation coefficient ρ: the cumulative offset blends the current round's offset with the historical value, and lowering ρ reduces the proportion contributed by the current round, limiting its influence on the final result. This paper also proposes a strategy for dynamically adjusting the constraint term for local model update offset. Specifically, the constraint term of the local loss function is scaled according to the client's computed cumulative offset, so the algorithm automatically adapts to various non-IID settings without careful hyperparameter selection, improving its flexibility. Finally, to dynamically change the weight of global model aggregation in each round and effectively improve convergence speed and model accuracy, this paper designs a dynamic weighted aggregation strategy that takes the cumulative offsets uploaded by all clients as the weights of global model aggregation in each round of communication.

Result: The proposed method is tested on three datasets using different deep learning models: the LeNet5, VGG16, and ResNet18 network models are trained on the MNIST, FMNIST, and CIFAR10 datasets, respectively. Four experiments are designed to prove the effectiveness of the proposed algorithm. To verify the accuracy of FedJDC at different degrees of non-IID data, the hyperparameter β of the Dirichlet distribution is varied and the performance of the algorithms is compared. Experimental results show that FedJDC improves model accuracy by 5.48%, 1.62%, 2.10%, and 2.28% on average compared with FedAvg, FedProx, FedAdp, and FedLAW, respectively. To evaluate the communication efficiency of FedJDC, the number of communication rounds needed to reach a target accuracy is counted and compared with that of the other algorithms. Experimental results show that under different degrees of non-IID data, FedJDC reduces communication rounds by 62.29%, 20.90%, 24.93%, and 20.47% on average compared with FedAvg, FedProx, FedAdp, and FedLAW, respectively. This paper also investigates the effect of the number of local epochs on the accuracy of the final model: FedJDC outperforms the other four methods under different epoch settings and demonstrates better robustness against the larger offset caused by more local update epochs. Ablation experiments further show that each optimization performs well on all datasets, and FedJDC combines the two strategies to achieve the best overall performance.

Conclusion: This paper optimizes the local and global model offsets from two aspects and proposes a joint dynamic correction algorithm for these offsets in federated learning. A cumulative offset is defined, and an attenuation coefficient is introduced into its calculation; by considering both historical and current offset information, the cumulative offset is adjusted dynamically to keep the training parameter updates stable. The dynamic constraint strategy takes the cumulative offset calculated by each client in every round as the constraint parameter of the client model. The dynamic weighted aggregation strategy changes the weight of each local model during global aggregation based on the cumulative offset of each participating client, so as to dynamically update the global model in each round. The combination of the two optimization strategies achieves good results, effectively alleviates the performance degradation of federated learning models caused by non-IID data, and provides a good foundation for the further implementation of federated learning in this field.
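To ground the Method description, the following Python sketch shows one way its three pieces could fit together: a cumulative offset that blends the current round's cosine-based drift with history through the attenuation coefficient ρ, a local loss whose constraint term scales with that offset, and an offset-aware aggregation step. This is a minimal illustration of the ideas as the abstract states them, not the paper's implementation; the blending rule, the FedProx-style proximal form of the constraint, the exp(-offset) weighting, and names such as `rho`, `mu_base`, and `aggregate` are all assumptions made for the example.

```python
import numpy as np

def cumulative_offset(prev: float, local_dir: np.ndarray,
                      global_dir: np.ndarray, rho: float = 0.5) -> float:
    """Blend this round's drift with history (assumed rule).

    The per-round offset is taken as 1 - cosine(local, global): it grows
    when a client's update direction disagrees with the global update.
    rho controls how much the current round contributes, mirroring the
    attenuation coefficient described in the abstract.
    """
    cos = float(np.dot(local_dir, global_dir) /
                (np.linalg.norm(local_dir) * np.linalg.norm(global_dir) + 1e-12))
    return rho * (1.0 - cos) + (1.0 - rho) * prev

def constrained_local_loss(task_loss: float, local_w: np.ndarray,
                           global_w: np.ndarray, offset: float,
                           mu_base: float = 0.1) -> float:
    """Local objective with a dynamically scaled constraint term.

    A FedProx-style proximal penalty keeps the local model near the
    global one; its coefficient grows with the client's cumulative
    offset, so clients that drift further are constrained harder
    (assumed mapping, not the paper's exact formula).
    """
    prox = float(np.sum((local_w - global_w) ** 2))
    return task_loss + 0.5 * mu_base * offset * prox

def aggregate(client_ws: list[np.ndarray], offsets: list[float]) -> np.ndarray:
    """Offset-aware global aggregation (assumed weighting).

    Clients with small cumulative offsets (update directions close to
    the global one) receive larger weights; exp(-offset), normalized to
    sum to 1, is one simple choice.
    """
    w = np.exp(-np.asarray(offsets, dtype=float))
    w /= w.sum()
    return np.sum([wi * cw for wi, cw in zip(w, client_ws)], axis=0)
```

For instance, with two clients whose cumulative offsets are 0.1 and 0.9, `aggregate` assigns them roughly 0.69 and 0.31 of the total weight, so the model that tracked the global update direction dominates that round's aggregation.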

【Fund】 Supported by the Hebei Provincial Science and Technology Program (SZX2020034)
  • 【Source】 Journal of Image and Graphics, No. 12, 2024
  • 【CLC Number】 TP18
  • 【Downloads】 15