BNN Pruning Method Based on Evolution from Ternary to Binary
【Abstract】 BNNs (Binarized Neural Networks) are popular for their extremely low memory requirements. Although BNNs can be compressed further through pruning, existing BNN pruning methods suffer from low pruning ratios, significant accuracy degradation, and reliance on fine-tuning after training. To overcome these limitations, a filter-level BNN pruning method based on evolution from ternary to binary, named ETB (Evolution from Ternary to Binary), is proposed. ETB is learning-based: by introducing trainable quantization thresholds into the quantization function of BNNs, it makes the weights and activations gradually evolve from ternary values to binary values or zero, enabling the network to identify unimportant structures automatically during training. A pruning-ratio adjustment algorithm is also designed to regulate the network's pruning rate. After training, all-zero filters and their corresponding output channels can be pruned directly, yielding a compact BNN without fine-tuning. To demonstrate the feasibility of the method and its potential to improve BNN inference efficiency without sacrificing accuracy, experiments are conducted on CIFAR-10: ETB prunes the VGG-Small model by 46.3%, compressing it to 0.34 MB with 89.97% accuracy, and prunes the ResNet-18 model by 30.01%, compressing it to 1.33 MB with 90.79% accuracy. In terms of accuracy and parameter count, ETB compares favorably with existing BNN pruning methods.
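The core mechanism described in the abstract can be illustrated with a minimal sketch: a ternary quantizer with a per-filter threshold maps weights to {-1, 0, +1}; as training shrinks the threshold, the zero band closes and the quantizer degenerates to binary, while filters whose weights all stay inside the band become all-zero and prunable. All names below are illustrative assumptions, not the paper's actual implementation, and the threshold is shown as a plain value rather than a trained parameter.

```python
# Illustrative sketch of ETB-style ternary-to-binary quantization
# (hypothetical names; the paper's real quantizer and training rule may differ).

def ternary_quantize(w, t):
    """Map a weight to {-1, 0, +1} using a (trainable) threshold t >= 0."""
    if w > t:
        return 1
    if w < -t:
        return -1
    return 0  # weights inside the zero band are suppressed

def quantize_filter(weights, t):
    """Quantize every weight of one filter with the filter's threshold."""
    return [ternary_quantize(w, t) for w in weights]

def is_prunable(weights, t):
    """A filter whose weights all quantize to zero can be removed outright."""
    return all(q == 0 for q in quantize_filter(weights, t))

filt = [0.8, -0.05, 0.1, -0.9]

# Early in training, a wide threshold keeps a ternary zero band:
print(quantize_filter(filt, 0.3))   # -> [1, 0, 0, -1]

# As the threshold of an important filter is driven toward 0,
# the same filter evolves to purely binary values:
print(quantize_filter(filt, 0.0))   # -> [1, -1, 1, -1]

# An unimportant filter stays entirely inside the zero band and is prunable:
print(is_prunable([0.05, -0.1, 0.02], 0.2))   # -> True
```

In the actual method the thresholds are learned jointly with the weights, and a pruning-ratio adjustment algorithm steers how many filters end up all-zero; this sketch only shows why a shrinking threshold turns ternary quantization into binary quantization and exposes removable filters.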
【Key words】 binarized neural network; pruning; trainable threshold; evolution
- 【Source】 Journal of Jilin University (Information Science Edition), 2024, No. 02
- 【CLC Number】 TP183