High-performance directional fuzzing scheme based on deep reinforcement learning

【Authors】 XIAO Tian; JIANG Zhihao; TANG Peng; HUANG Zheng; GUO Jie; QIU Weidong

【Corresponding author】 QIU Weidong

【Affiliations】 School of Cyber Science and Engineering, Shanghai Jiao Tong University; Columbia University

【Abstract】 With the rapid development of the mobile Internet and information technology, more and more applications have become part of people's daily lives. However, vulnerabilities in these applications pose a severe threat to users' privacy and information security. In recent years, fuzzing has been widely used as one of the popular vulnerability discovery techniques because the vulnerabilities it finds are easy to reproduce and its false positive rate is low. It generates test cases randomly and executes the program, and it is optimized in terms of coverage or sample generation to reach deeper program paths. However, the mutation operations in fuzzing are somewhat blind and tend to make the generated test cases execute the same program paths. Consequently, traditional fuzzing generally suffers from low discovery efficiency, highly random input construction, and algorithms that are only weakly tailored to the program structure. To address these problems, a high-performance directional fuzzing scheme based on deep reinforcement learning was proposed. It obtains runtime information through methods such as program instrumentation and uses a deep reinforcement learning network to guide the selection of test samples, generating targeted and directed test samples that quickly approach and check program paths that may contain vulnerabilities, thereby improving fuzzing efficiency. The experimental results showed that, on the LAVA-M dataset and the real-world applications LibPNG and Binutils, the proposed scheme performed better than the popular fuzzing tools AFL and AFLGO in vulnerability detection and reproduction. Therefore, the scheme can support future vulnerability discovery and security research.
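The idea of letting a learned policy choose which test sample to mutate, using feedback from an instrumented target, can be illustrated with a small sketch. This is not the scheme from the paper: a tabular epsilon-greedy value estimate stands in for the deep reinforcement learning network, and the target is a toy function rather than an instrumented binary. The names `MAGIC`, `run_instrumented`, `mutate`, and `fuzz`, as well as all parameter values, are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical 4-byte prefix that guards the "vulnerable" target path.
MAGIC = b"FUZZ"


def run_instrumented(data: bytes) -> int:
    """Toy stand-in for an instrumented target.

    Returns how many leading bytes of MAGIC the input matches, which acts
    as a proxy for how close the execution came to the target path.
    """
    depth = 0
    for got, want in zip(data, MAGIC):
        if got != want:
            break
        depth += 1
    return depth


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte (input must be non-empty)."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seeds, budget=200_000, epsilon=0.1, alpha=0.2, rng_seed=0):
    """Directed fuzzing loop with epsilon-greedy seed selection.

    value[i] is a running estimate of the reward obtained by mutating
    seed i; the reward is the child's normalized closeness to the target.
    """
    rng = random.Random(rng_seed)
    queue = list(seeds)
    value = [run_instrumented(s) / len(MAGIC) for s in queue]
    best = max(run_instrumented(s) for s in queue)

    for _ in range(budget):
        if best == len(MAGIC):
            break  # target path reached
        # Explore a random seed with probability epsilon, else exploit
        # the seed with the highest estimated reward.
        if rng.random() < epsilon:
            idx = rng.randrange(len(queue))
        else:
            idx = max(range(len(queue)), key=value.__getitem__)

        child = mutate(queue[idx], rng)
        depth = run_instrumented(child)
        reward = depth / len(MAGIC)
        # Move the estimate for the chosen seed toward the observed reward.
        value[idx] += alpha * (reward - value[idx])

        # Keep inputs that get closer to the target than anything so far.
        if depth > best:
            best = depth
            queue.append(child)
            value.append(reward)

    return best, queue
```

Calling `fuzz([b"AAAA"])` returns the deepest match reached and the final seed queue. Rewarding closeness to the target, rather than any new coverage, is what makes the selection directed in this sketch; a real system would replace the table with a trained network and the toy function with runtime feedback from instrumentation.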