Research on Analyzing Influence of Users in Social Network (社会网络中用户影响力分析技术研究)

【Author】 Zhang Yue (张玥)

【Advisor】 Zhang Hongli (张宏莉)

【Author information】 Harbin Institute of Technology, Information Security, 2015, PhD

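【Summary】 Social media on the Internet is developing rapidly. Social network media are characterized by changing and evolving topics, massive numbers of users, and a complex, dynamically changing network structure formed by the associations that arise as people interact. Because social network media are open and have a low barrier to entry, because the quality of participants varies widely and the public expresses its demands in many different ways, and because information spreads quickly over a range that cannot be controlled, managing social network media raises new problems. High-influence users in the media are very active and spread positive information. Identifying high-influence users in social media and using them for positive guidance of public opinion plays an important role in maintaining social stability, managing social order, and steering online information toward healthy development. This dissertation studies key techniques for influence analysis in social networks. The main contents are:

(1) Analysis of the multidimensional attributes and evolution patterns of influence. In social networks, influence has social, network, and dynamic characteristics. Three quantitative indicators are proposed, namely breadth, depth, and duration of influence, and influence is quantified comprehensively on the basis of analysis from the social and network perspectives. User influence is evaluated from multiple angles by combining three behavioral features: the user's connectivity in the network, the strength of interaction between users, and the depth of interaction as determined by article quality. The dissertation improves the Page Rank algorithm: it first builds an inter-user association matrix from a global perspective, then analyzes user behavior from a local perspective, and establishes a comprehensive analysis model of user influence. The personalized pointing-reinforcement relation is also improved, which increases the closeness of association between users. For the dynamic nature of influence, evolution patterns are analyzed in terms of long-term persistence and stability. Experiments on a Tianya forum dataset show that some high-influence users are active users of the forum, and that users who hold high influence over the long term actually guide the direction of opinion on the forum.

(2) Study of the relationship between influence mechanisms and network structure. Nodes at key positions in the network play an important role, so the dissertation studies algorithms for identifying structural-hole users and analyzing their influence. Structural holes are the connecting regions between different communities. Communities and structural holes describe macroscopic properties of a network from two different angles, and the structural-hole properties of nodes are examined while communities are being partitioned. The connecting role that structural-hole users play in the network differs from that of high-influence users within a community. Communities are identified with a multilevel partitioning approach: to merge tightly connected local regions quickly, several node-merging patterns are proposed for graph compression, and a conductance function is used to determine community boundaries. After a precise partition of the compressed graph, communities are identified by backtracking. The algorithm has low complexity and can partition large graphs effectively. Based on the connectivity of structural-hole users inside and outside communities, an influence analysis algorithm for structural-hole users is proposed.

(3) The large scale of social networks poses challenges for influence analysis. The dissertation studies the relationship between influence and network structural features and proposes a fast influence analysis method. Statistics on a real Tianya forum dataset show that in-degree and out-degree follow power-law distributions, that users with in-degree 0 make up a daily average of 86.16% of users, and that users with out-degree 1 make up a daily average of 64.98%. To reduce storage space and improve running efficiency, the dissertation exploits the skew of the degree association on the real network graph and proposes SD-Rank, a set-partition-based algorithm for space-efficient storage and fast user ranking. SD-Rank divides users into two sets according to whether their in-degree is 0. For the in-degree-0 set, it builds linked lists by out-degree to improve the Page Rank algorithm. The time and space complexity of SD-Rank is O(V'), where V' is the set of nodes with nonzero in-degree. SD-Rank ranks 39% more efficiently than the Page Rank algorithm, and it runs faster than the Colibri algorithm, which is based on computing a non-redundant orthogonal basis of the matrix.

(4) Study of the relationship between influence and evolving information. Because influence changes dynamically with events, the dissertation analyzes the explicit and implicit associations between users in the topics they jointly participate in, and the strength of users' influence at different stages of an event. Users' dynamic influence is analyzed from these two aspects during event evolution analysis. On the premise that users are implicitly associated through topics, several distance measures are first proposed, and on that basis methods for identifying global events and sub-events are proposed. During topic diffusion and evolution, the diffusion process and evolution trend of events are analyzed, and immediate influence is distinguished from delayed influence, with delayed influence decaying exponentially over time. Finally, building on the direct influence produced by interaction between users and the indirect influence within topics, the dissertation analyzes the fluctuation of topic and user influence and the co-occurrence of their peaks, and proposes a user influence computation model based on the evolution of hot events. On forum and microblog datasets, the evolution-based hot-topic discovery algorithm achieves higher precision and recall than identification on a static view. The proposed algorithm makes good use of users' global associations in the network and combines the peaks of hot events and user influence, and it obtains the best influence ranking among the algorithms compared.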
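
The set-partition idea in item (3) can be illustrated with a minimal sketch. It is not the dissertation's SD-Rank implementation. It assumes the classic PageRank recurrence without redistribution of dangling mass, and the function name, parameter names, and defaults are hypothetical. Under that assumption every node with in-degree 0 keeps the constant score (1 - d) / N, so its contribution can be computed once (grouped by out-degree) and the iteration only needs to visit nodes with nonzero in-degree.

```python
from collections import defaultdict


def partitioned_pagerank(nodes, edges, d=0.85, iters=50):
    """Illustrative degree-partitioned PageRank (hypothetical helper).

    Assumes rank(v) = (1 - d) / N + d * sum(rank(u) / outdeg(u)) over
    in-neighbours u, with no dangling-mass redistribution.
    """
    n = len(nodes)
    out_nbrs = defaultdict(list)
    in_deg = defaultdict(int)
    for u, v in edges:
        out_nbrs[u].append(v)
        in_deg[v] += 1

    # set1: in-degree 0 (score never changes); set2: everything else.
    set1 = [u for u in nodes if in_deg[u] == 0]
    set2 = [u for u in nodes if in_deg[u] > 0]
    base = (1.0 - d) / n

    # Group set1 by out-degree: every node in a group sends the same share.
    by_out_degree = defaultdict(list)
    for u in set1:
        if out_nbrs[u]:
            by_out_degree[len(out_nbrs[u])].append(u)

    # Constant contribution that each set2 node receives from set1.
    const_in = defaultdict(float)
    for k, group in by_out_degree.items():
        share = base / k
        for u in group:
            for v in out_nbrs[u]:
                const_in[v] += share

    # In-neighbour lists restricted to edges that start inside set2.
    inner_in = defaultdict(list)
    for u in set2:
        for v in out_nbrs[u]:
            inner_in[v].append(u)

    # Iterate over set2 only.
    rank = {v: base for v in set2}
    for _ in range(iters):
        rank = {
            v: base + d * (const_in[v]
                           + sum(rank[u] / len(out_nbrs[u]) for u in inner_in[v]))
            for v in set2
        }

    scores = {u: base for u in set1}
    scores.update(rank)
    return scores


if __name__ == "__main__":
    demo_nodes = ["a", "b", "c", "d"]
    demo_edges = [("a", "c"), ("b", "c"), ("c", "d"), ("d", "c")]
    for user, score in sorted(partitioned_pagerank(demo_nodes, demo_edges).items()):
        print(user, round(score, 4))
```

With the in-degree-0 share reported for the Tianya data (86.16% on average), a partition of this kind would leave only a small fraction of nodes in the iterated set, which is the intuition behind the reported savings.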

【Abstract】 Social media has become increasingly popular with the development of the Internet. It is characterized by massive numbers of users, evolving topics, and dynamic network structures. Its shortcomings are also clear: the openness of the network brings a low threshold for participation, differences in cognition lead to many ways of expressing disapproval, and the rapid spread of information makes the range of communication hard to control. These characteristics present both opportunities and challenges for government management. It is important to identify influential or authoritative users in social media, because influential users produce more information and lead opinion in a healthy direction. Mining influential users in order to amplify the dissemination of that information helps maintain public stability and a harmonious order. Taking BBS and Weibo as examples, this dissertation studies the key technologies of social influence mechanisms in social networks. The main work and contributions of this dissertation are as follows:

(1) Social influence has social, network-correlation, and dynamic features. We first analyze the key factors of influence, namely breadth, depth, and duration, and analyze influence comprehensively from the social and network perspectives. The complex correlation of users in a social network, the interaction strength between users, and the depth of interaction determined by how articles are written are behavioral features of social activity, and an algorithm for analyzing influential users is built on these multiple attributes. A correlation matrix is constructed from users' replies, and indirect relations are analyzed. User behavior is evaluated from a local view. Based on the PageRank algorithm, and introducing user behavior features and the user relational network, this dissertation designs a multiple attributes rank (MAR) algorithm. A deficiency of PageRank is that the rank score increases with link direction alone, ignoring differences in link strength. We introduce a personalized enforcement ranking mechanism, which improves the closeness between influential users. The evolution trend of influential users is analyzed by time-span division and stability. We conducted experiments with data from Tianya BBS and evaluated multiple facets of identifying influential users. We conclude that some influential users are active users of the forum and that influential users genuinely lead opinion.

(2) Nodes are more important when they occupy special positions in the network, so we study algorithms for identifying and analyzing structural-hole users. Structural holes are very important in a network because they offer the opportunity to connect different communities. Communities and structural holes are different aspects of the global character of a network. Structural holes connect different communities sparsely, through weak ties. Identifying network communities is an NP-hard problem. Structural holes can be found while searching for communities. Users located in structural holes play a more important role than influential users within a community. We propose an algorithm that identifies communities and structural holes simultaneously. Using a multi-level hierarchical partition mechanism, we propose merging patterns that condense the network graph into local clusters, identify communities on the condensed network, and then refine them on the uncoarsened network, which makes community identification in large graphs more efficient. Based on internal and external correlation, we modify the HITS algorithm to compute inner-community and outer-community connection scores.

(3) Large-scale social networks pose great challenges for influence analysis. We study the correlation between influence and the characteristics of network structure and propose a fast influence analysis algorithm based on degree distribution. In the Tianya BBS dataset, the average proportion of users with in-degree 0 is 86.16%, and the average proportion of users with out-degree 1 is 64.98%. In-degree and out-degree follow a power law. To reduce the time and space cost of influence computation, this dissertation analyzes the relational graph of users in the BBS and the skewed distribution of degree. Based on PageRank, users are divided into two sets: users with in-degree 0 in set1 and users with nonzero in-degree in set2, so that every edge either points from set1 to set2 or connects nodes within set2. The nodes in set1 are grouped by out-degree, with nodes of the same out-degree listed together. A fast ranking algorithm, SD-Rank, is obtained by applying this set division method and list structure. With the set partition algorithm based on degree distribution, time and space complexity decreases from O(V+E) to O(V'), where V' is the set of nodes with nonzero in-degree. Experimental results on the Tianya BBS dataset demonstrate that SD-Rank is more efficient than PageRank and than Colibri, which computes a non-redundant orthogonal basis.

(4) Influence changes dynamically as events evolve: users are linked by explicit and implicit correlations, and their influence strength differs across the stages of an event. Users influence others through topics. A reply is an explicit correlation, while participation in common subjects brings an implicit correlation. This dissertation proposes an evolutionary influence algorithm that follows event evolution. Global events and sub-events are detected based on several distance functions proposed in this dissertation to measure implicit correlation. Hot events are detected through co-occurring words in subjects, and an event progresses as the co-occurring words in its subjects change. User communities exist implicitly in relevant subjects, and techniques are designed to identify them. Information diffusion has two characteristics: correlation and evolution. An evolutionary event-based user influence ranking algorithm is proposed. During event diffusion and evolution, the algorithm distinguishes instant influence from deferred influence, with deferred influence decaying exponentially over time. Although topic influence and user influence fluctuate asynchronously, concurrent peaks of topic influence and user influence indicate that the two reinforce each other. Taken together, these considerations lead to a computational model of evolving influence based on dynamic events. On forum and Weibo datasets, recall and precision are higher than those obtained with static analysis. Because it considers global correlation and the concurrence of peaks in hot events and user influence, the proposed algorithm is more effective than the others compared.
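
The distinction in item (4) between instant influence and deferred influence that decays exponentially with time can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the decay constant, the weighting of interactions, or the exact formula, so `decay_rate`, the `(timestamp, weight)` event format, and the function name are hypothetical.

```python
import math


def split_influence(events, now, decay_rate=0.1):
    """Return (instant, deferred) influence at time `now`.

    events: iterable of (timestamp, weight) pairs, e.g. replies a user
    received. Interactions at `now` count in full as instant influence.
    Earlier interactions count as deferred influence, discounted by
    exp(-decay_rate * age). All names and defaults are assumptions.
    """
    instant = sum(w for t, w in events if t == now)
    deferred = sum(
        w * math.exp(-decay_rate * (now - t))
        for t, w in events
        if t < now
    )
    return instant, deferred


if __name__ == "__main__":
    replies = [(0, 1.0), (5, 2.0), (10, 3.0)]
    instant, deferred = split_influence(replies, now=10, decay_rate=0.1)
    print("instant:", instant, "deferred:", round(deferred, 4))
```

Choosing `decay_rate = ln(2) / h` makes `h` the half-life of an interaction, which gives a more interpretable knob when tuning on real data.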

  • 【Classification number】 G206; TP393.07
  • 【Times cited】 7
  • 【Downloads】 761