A fingerprint engine for author profiling (article in English)


【Author】 DONG Nai-peng (1), ZHAO He-ji (1), SCHOMMER Christoph (2)

【Institution】 1. Department of Computer Science and Technology, Shandong University, Jinan 250101, China; 2. Department of Information and Computer Sciences, University of Luxembourg, Luxembourg 2311, Luxembourg

【Abstract】 With the development of the internet, digital texts are proliferating, and protecting their copyright has become increasingly important in recent years. One way to address the copyright problem is to profile an author's writing style: by comparing writing styles, we can judge whether a text was written by a certain author. Most current research in author profiling focuses on examining linguistic attributes or finding new, more effective ones. However, extracting an author's profile effectively remains a challenging task. This paper builds a model of an author fingerprint engine that takes texts by one author in a certain domain as input and produces a profile of that author's writing style in that domain as output. Using this fingerprint engine, we can tell with a certain probability whether an input text was written by an author among a list of possible authors. The paper focuses on author profiling of English texts. Writing styles are represented by linguistic attributes and linguistic measurements. Statistical methods, namely standard deviation analysis and principal components analysis, are used to evaluate the effectiveness of these measurements.
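The abstract does not list the specific attributes the engine uses, so the following is only a minimal sketch of the general idea, under stated assumptions: the three measurements (average word length, average sentence length, type-token ratio) and the function names `style_features` and `author_profile` are hypothetical choices for illustration, not the paper's actual feature set. The sketch covers only the standard deviation step; principal components analysis is not shown.

```python
import re
import statistics


def style_features(text):
    """Return a few illustrative stylometric measurements for one text.

    The choice of features is an assumption made for this sketch.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    return {
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "avg_sentence_length": len(words) / len(sentences),
        "type_token_ratio": len(set(words)) / len(words),
    }


def author_profile(texts):
    """Aggregate per-text features into a (mean, standard deviation) pair per feature."""
    rows = [style_features(t) for t in texts]
    profile = {}
    for name in rows[0]:
        values = [row[name] for row in rows]
        # stdev needs at least two samples; report 0.0 for a single text.
        spread = statistics.stdev(values) if len(values) > 1 else 0.0
        profile[name] = (statistics.mean(values), spread)
    return profile


if __name__ == "__main__":
    samples = [
        "The cat sat on the mat. It was warm.",
        "A dog ran across the field. Nobody followed it.",
    ]
    for feature, (mean, spread) in author_profile(samples).items():
        print(f"{feature}: mean={mean:.3f} stdev={spread:.3f}")
```

In a sketch like this, a measurement whose standard deviation stays small across one author's texts would be a plausible candidate for a stable part of that author's profile, while a measurement that varies widely would tell us less about the author.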

  • 【Source】 Journal of Shandong University (Engineering Science), 2009, Issue 05
  • 【Classification No.】 TP391.1
  • 【Times cited】 1
  • 【Downloads】 81