A Differential Privacy Text Desensitization Method for Enhancing Semantic Consistency

【Authors】 Guan Yeli; Luo Senlin; Pan Limin; Zhang Ji; Yu Jingwei

【Affiliations】 School of Information and Electronics, Beijing Institute of Technology; North Institute for Scientific and Technical Information

【Abstract】 Text desensitization is an important privacy protection technique, but balancing its privacy protection against semantic consistency with the original text remains difficult. Existing differential privacy desensitization methods select a substitute for each sensitive word by sampling with probabilities computed from similarity. The selected substitutes are often semantically inconsistent with, or even unrelated to, the original text, which severely degrades how well the desensitized text preserves the original meaning. This paper proposes a differential privacy text desensitization method that strengthens semantic consistency: a truncated distance metric adjusts the selection probabilities of candidate substitutes and restricts semantically unrelated ones. Experiments on real datasets show that the method effectively improves the semantic consistency between the desensitized text and the original text and has high practical value.
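The abstract does not give the truncated distance formula, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes an exponential-mechanism-style weight `exp(-epsilon * d / 2)` over Euclidean distance between word embeddings, and it models truncation as a hard cutoff `max_distance` beyond which a candidate gets zero probability. The function names, the toy embeddings, and the cutoff rule are all hypothetical and are not taken from the paper. The sketch also makes no claim about the formal privacy guarantee after truncation.

```python
import math
import random

# Hypothetical 2-D embeddings, for illustration only (not from the paper).
TOY_EMBEDDINGS = {
    "doctor": (0.0, 0.0),
    "physician": (0.1, 0.0),
    "nurse": (0.3, 0.1),
    "banana": (5.0, 5.0),  # semantically unrelated, far away
}


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def selection_probabilities(word, embeddings, epsilon, max_distance=None):
    """Return {candidate: probability} for replacing `word`.

    Assumption: weight = exp(-epsilon * d / 2). If `max_distance` is set,
    candidates farther than that are dropped (assumed form of truncation).
    """
    source = embeddings[word]
    weights = {}
    for candidate, vector in embeddings.items():
        d = euclidean(source, vector)
        if max_distance is not None and d > max_distance:
            continue  # truncated: semantically distant candidate excluded
        weights[candidate] = math.exp(-epsilon * d / 2.0)
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


def sample_replacement(word, embeddings, epsilon, max_distance=None, rng=random):
    """Draw one substitute word according to the selection probabilities."""
    probs = selection_probabilities(word, embeddings, epsilon, max_distance)
    candidates = list(probs)
    return rng.choices(candidates, weights=[probs[c] for c in candidates], k=1)[0]


if __name__ == "__main__":
    untruncated = selection_probabilities("doctor", TOY_EMBEDDINGS, epsilon=1.0)
    truncated = selection_probabilities(
        "doctor", TOY_EMBEDDINGS, epsilon=1.0, max_distance=1.0
    )
    print("without truncation:", untruncated)
    print("with truncation:   ", truncated)
    print("sampled:", sample_replacement("doctor", TOY_EMBEDDINGS, 1.0, 1.0))
```

Without the cutoff, the unrelated word keeps a small but nonzero chance of being chosen, which is the failure mode the abstract attributes to existing methods. With the cutoff, the probability mass is renormalized over nearby candidates only.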

【Funding】 National Key Research and Development Program of China (2018YFC2000300)
  • 【Source】 Journal of Information Security Research (信息安全研究), 2024, Issue 08
  • 【Classification Code】 TP391.1; TP309