Improving Ordinal Regression Model for Rating Inference through Optimizing Training Sample Selection
【Author】 Huizhen Wang (1,2+), Jingbo Zhu (1,2); 1 Key Laboratory of Medical Image Computing (Northeastern University), Ministry of Education; 2 Natural Language Processing Laboratory, Northeastern University, Shenyang, Liaoning, P.R. China, 110819
【Affiliation】 Key Laboratory of Medical Image Computing (Northeastern University), Ministry of Education; Natural Language Processing Laboratory, Northeastern University
【Abstract】 This paper addresses content-based rating inference in sentiment analysis, in which the user’s assessments of social issues or products are determined on a multi-point scale rather than as a polarity (positive or negative). The most common way to tackle the rating inference problem is to use machine learning algorithms such as ordinal regression models. In practice, different reviews of the same object are generally written by different users, and the rating annotations provided by different users are often inconsistent. In such cases, standard ordinal regression algorithms would fail because of the inconsistent rating annotations. To address this challenge, this paper proposes two approaches to improving standard ordinal regression algorithms by optimizing the selection of training samples: tolerance-based selection and ranking-loss-based selection. Experiments on two publicly available restaurant review datasets, one English and one Chinese, demonstrated significant improvements over standard algorithms.
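- 【Proceedings】 Proceedings of the 6th National Conference on Information Retrieval
- 【Conference】 6th National Conference on Information Retrieval
- 【Conference date】 2010-08-12
- 【Conference location】 Mudanjiang, Heilongjiang, China
- 【Classification code】 TP18
- 【Organizer】 Technical Committee on Information Retrieval and Content Security, Chinese Information Processing Society of China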