Research on Mobile Social Network Mining Algorithms and Their Statistical Properties
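【Author】 Deng Zhouhui (邓周灰);
【Advisor】 Zhao Ming (赵明);
【Author information】 Guizhou University, Probability Theory and Mathematical Statistics, 2017, Master's thesis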
【Abstract】 With the rapid spread of mobile phones, mobile social networks (mobile communities) have become a widely studied topic. A mobile community is a relatively dense region of the mobile Internet: the entities inside it are tightly connected, whereas different communities are isolated from one another or only loosely linked. The topic therefore has academic value for mobile social network theory and practical value for mobile community marketing. Call and SMS records are two important and simple kinds of mobile communication data. The data in this thesis come from the call detail records and SMS records of China Mobile subscribers in one region. The sample covers 1 million anonymized users and 2.64 million call records, with fields including user ID, opposite user ID, communication time, and call duration. Based on these real call and SMS data, this thesis studies mining algorithms for mobile communities and their statistical properties. The main contributions and innovations are as follows.

Research on identifying mobile communities and discriminating key users is valuable for the theory of mobile social networks and also advances research on social networks and social computing. This thesis proposes a mobile community identification model based on the Potts spin-glass (PSG) model, whose Hamiltonian modularity evaluation function partitions users into communities well. Building on the PSG model, it proposes a key-user discrimination model based on the Jaccard coefficient, which offers an effective way to evaluate how closely key users are related. The empirical study shows that, on some dimensions, the mobile user data follow the general pattern of the 80/20 rule: for example, more than 90% of communications (SMS and calls) come from about 10% of mobile users.

Using the real call and SMS data, this thesis groups mobile community characteristics into four categories and extracts eight mobile community metrics from the raw data. Discriminant coordinates analysis is used to reduce the high-dimensional data to two dimensions, a prediction model for mobile community characteristics is built, and the general steps of the prediction are given, providing a new method for analyzing mobile community characteristics. Experimental results show that the proposed model is feasible and effective, with a prediction accuracy of 95%.
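For orientation, a Potts spin-glass community Hamiltonian is often written in the Reichardt-Bornholdt form shown below. The abstract does not state the exact evaluation function used in the thesis, so the null model $p_{ij}$ and the resolution parameter $\gamma$ here are assumptions for illustration only.

```latex
\mathcal{H}(\{\sigma\}) \;=\; -\sum_{i<j} \bigl(A_{ij} - \gamma\, p_{ij}\bigr)\,\delta(\sigma_i, \sigma_j),
\qquad
p_{ij} = \frac{k_i k_j}{2m}
```

Here $A_{ij}$ is the adjacency matrix of the communication graph, $\sigma_i$ is the community (spin state) of user $i$, $k_i$ is the degree of user $i$, and $m$ is the number of edges. With $\gamma = 1$ and this choice of $p_{ij}$, minimizing $\mathcal{H}$ corresponds, up to normalization, to maximizing Newman-Girvan modularity.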
【Key words】 Mobile social network; Mobile phone data; Network discovery; Key user; Discriminant coordinates analysis;
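Two quantities named in the abstract, the Jaccard coefficient between users and the share of traffic generated by the most active users, can be illustrated with a minimal Python sketch. The record layout (caller, callee pairs), the toy data, and the helper names `contact_sets`, `jaccard`, and `top_share` are all hypothetical; the abstract does not describe how the thesis computes either quantity.

```python
from collections import Counter, defaultdict


def contact_sets(records):
    """Map each user to the set of users they exchanged calls or SMS with."""
    contacts = defaultdict(set)
    for caller, callee in records:
        contacts[caller].add(callee)
        contacts[callee].add(caller)
    return contacts


def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B|; taken as 0.0 if both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_share(counts, fraction=0.10):
    """Share of all communications initiated by the most active `fraction` of users."""
    ranked = sorted(counts.values(), reverse=True)
    k = max(1, round(len(ranked) * fraction))  # always include at least one user
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


# Toy (caller, callee) records; purely illustrative, not the thesis data.
records = [("u1", "u2"), ("u1", "u3"), ("u2", "u3"), ("u4", "u3"), ("u1", "u2")]

contacts = contact_sets(records)
outgoing = Counter(caller for caller, _ in records)

similarity_u1_u2 = jaccard(contacts["u1"], contacts["u2"])
similarity_u1_u4 = jaccard(contacts["u1"], contacts["u4"])
share_of_top_users = top_share(outgoing, fraction=0.10)

print(similarity_u1_u2, similarity_u1_u4, share_of_top_users)
```

A Jaccard value near 1 means two users share most of their contacts, which is one plausible reading of the "closeness" the abstract refers to. A `top_share` value above 0.9 at `fraction=0.10` would match the concentration reported for the real data.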