基于关联规则的安全特色关键词提取研究
New Security Feature Extraction Method Based on Association Rules
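【Abstract (translated from the Chinese)】 To evade security filtering, malicious actors on the Internet deform the text of harmful content before spreading it online. To identify and filter such text, the method first performs an initial recognition based on word co-occurrence and character-encoding rules, picking out harmful strings that carry no lexical meaning yet occur frequently. Then, exploiting the fact that the characters in these strings appear adjacent, in order, and frequently, it proposes a new association-rule algorithm that extracts the distinctive topic terms through self-learning. Experiments show that the method addresses the inability of traditional methods to recognize deformed topic terms, complements keyword filtering and topic filtering, and improves the efficiency of content-based security filtering.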
【Abstract】 To prevent the spread on the Internet of harmful metamorphosed texts that escape traditional security filtering, a security feature extraction method is presented. The metamorphosed character features of these texts are analyzed and recognized according to character co-occurrence and the different encodings of characters and symbols, and a new association-rule algorithm is then proposed to extract the feature terms. Experiments show that the method can identify metamorphosed terms that traditional methods miss and, as a complement to topic filtering, improves the efficiency and capability of feature identification.
【Key words】 association rules; security filtering; feature extraction; transmogrified text
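The abstract gives no details of the association-rule algorithm, so the following is only a minimal sketch of the general idea it describes: finding character strings whose characters occur adjacent, in order, and frequently across a set of texts. The function names (`frequent_adjacent_strings`, `maximal_strings`), the parameters `min_support` and `max_len`, the level-wise Apriori-style pruning, and the toy input are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def frequent_adjacent_strings(texts, min_support=2, max_len=6):
    """Return {string: support} for character strings that occur, contiguous
    and in order, in at least `min_support` of the input texts.

    Hypothetical sketch: level-wise (Apriori-style) growth, where a string of
    length k is counted only if both of its length k-1 substrings (its prefix
    and its suffix) were already found frequent.
    """
    frequent = {}
    previous = None  # frequent strings of the previous length
    for k in range(1, max_len + 1):
        counts = Counter()
        for text in texts:
            seen = set()  # count each candidate at most once per text
            for i in range(len(text) - k + 1):
                gram = text[i:i + k]
                if previous is not None and (
                    gram[:-1] not in previous or gram[1:] not in previous
                ):
                    continue  # pruned: a shorter part is not frequent
                seen.add(gram)
            counts.update(seen)
        current = {g: c for g, c in counts.items() if c >= min_support}
        if not current:
            break
        frequent.update(current)
        previous = current
    return frequent


def maximal_strings(frequent):
    """Keep only the strings not contained in a longer frequent string."""
    return {
        s: c for s, c in frequent.items()
        if not any(s != t and s in t for t in frequent)
    }


# Toy data (invented): a word disguised by inserting '*' between letters.
texts = ["xxf*r*e*eyy", "aaf*r*e*ebb", "ccf*r*e*edd"]
freq = frequent_adjacent_strings(texts, min_support=3, max_len=10)
print(maximal_strings(freq))  # {'f*r*e*e': 3}
```

Support here is counted per text rather than per occurrence, so a string repeated many times in a single document does not by itself become frequent; whether the paper counts support this way is not stated in the abstract.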
- 【Source】 计算机工程与应用 (Computer Engineering and Applications), 2006, Issue S1
- 【Classification No.】 TP393.02
- 【Times cited】 10
- 【Downloads】 248