Sarcasm Recognition Based on Multi-Dimensional Semantic Features and a Hierarchical Attention Mechanism

【Authors】 SONG Liujing; ZHAO Zefang; MA Yuxiang; SHEN Hanji; LI Jun

【Corresponding author】 LI Jun

【Affiliations】 Computer Network Information Center, Chinese Academy of Sciences; University of Chinese Academy of Sciences; School of Computer and Information Engineering, Henan University

【Abstract】 Sarcasm is a complex form of linguistic expression that plays an important role in everyday communication. With the rapid development of artificial intelligence and social networks, sarcasm recognition has become one of the hot research topics in natural language processing. Existing work on sarcasm recognition often represents the features of sarcastic text along a single dimension, ignoring the subtle differences among those features and their relative importance. This paper treats sarcasm recognition as a text classification task. In the feature extraction stage, sarcastic text is given a multi-dimensional semantic feature representation based on its inconsistency features, affective features, syntactic (dependency) structure features, and style features. In the feature fusion stage, because features from different dimensions differ in their contribution to, and correlation with, the overall feature, a hierarchical attention mechanism is used to adjust the influence of the different sarcasm-related linguistic features on overall model performance. Experimental results show that the proposed model can extract latent semantic features of sarcastic text from multiple dimensions, and that its performance improves clearly on the public datasets IAC, Tweets, and Reddit.
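The abstract describes the fusion stage only in words, so a small illustration may help. The following is a minimal sketch, not the authors' model: it assumes two levels of plain dot-product attention (first over the items inside each feature dimension, then over the four dimensions), and every name, size, and context vector in it (`attention_pool`, `hierarchical_fuse`, `SIZE`, the random inputs) is hypothetical and chosen only for the example.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into non-negative weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_pool(vectors, context):
    """Weight each vector by its dot-product score against `context`
    and return (weighted sum, attention weights)."""
    weights = softmax([dot(v, context) for v in vectors])
    size = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(size)]
    return pooled, weights


def hierarchical_fuse(features, token_contexts, dimension_context):
    """Level 1: pool the items inside each feature dimension.
    Level 2: pool the per-dimension vectors into one fused vector."""
    names = list(features)
    per_dimension = [attention_pool(features[n], token_contexts[n])[0] for n in names]
    fused, weights = attention_pool(per_dimension, dimension_context)
    return fused, dict(zip(names, weights))


def random_vector(rng, size):
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


# Hypothetical inputs: 5 items of width 8 for each of the four feature
# dimensions named in the abstract. A trained model would learn the context
# vectors; here they are random stand-ins.
rng = random.Random(0)
SIZE = 8
NAMES = ["inconsistency", "affective", "syntactic", "style"]
features = {n: [random_vector(rng, SIZE) for _ in range(5)] for n in NAMES}
token_contexts = {n: random_vector(rng, SIZE) for n in NAMES}
dimension_context = random_vector(rng, SIZE)

fused, dimension_weights = hierarchical_fuse(features, token_contexts, dimension_context)
print(len(fused))          # width of the fused representation
print(dimension_weights)   # one weight per feature dimension, summing to 1
```

In this sketch the second-level weights play the role the abstract assigns to the hierarchical attention mechanism: they scale how much each feature dimension contributes to the fused representation that a classifier would consume.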

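【Funding】 Supported by the National Key Research and Development Program of China (2019YFB1405801); the Key Project of International Cooperation of the Chinese Academy of Sciences (241711KYSB20180002); and the Key R&D and Promotion Special Project of Henan Province (222102210040)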
  • 【Source】 Chinese High Technology Letters (高技术通讯), 2024, Issue 05
  • 【Classification No.】 TP391.1; TP18
  • 【Downloads】 27