Research on Traffic Anomaly Detection Methods Based on the Hadoop Platform
【Author】 Hu Jie (胡洁);
【Advisor】 Cai Qiong (蔡琼);
【Author information】 Wuhan Institute of Technology, Computer Technology, 2016, Master's
【Abstract】 A traffic anomaly is a pronounced, irregular change in network traffic. Transient congestion, distributed denial-of-service attacks, port scanning, and abnormal network conditions such as routing anomalies can all cause network anomalies. These anomalies degrade the experience of both operators and users, which makes traffic anomaly detection necessary. The rapid development of the Internet has brought explosive growth in data, and traffic anomalies have increased along with it. Traditional traffic anomaly detection is severely limited when applied to data at this scale, whereas Hadoop offers high efficiency, fault tolerance, scalability, and reliability in processing massive data. Combining traffic anomaly detection with Hadoop therefore allows anomalies in large volumes of traffic data to be detected more effectively. After analyzing the characteristics of anomalous network traffic, this thesis proposes a detection method that analyzes traffic features over a fixed number of packets. Applying this method to abnormal changes in traffic features leads to the following conclusion: over a short time interval, traffic in a normal network environment remains relatively stable, whereas in an abnormal environment the distribution of traffic features changes sharply and becomes unstable. This difference makes it possible to decide whether a traffic anomaly has occurred. Throughout the process, data files are stored in Hadoop's HDFS distributed file system, and a MapReduce-based traffic anomaly detection algorithm is proposed and implemented within the MapReduce computing model. This computing model simplifies the calculation steps and improves the real-time performance of anomaly detection. Based on the proposed algorithm, the thesis designs a Hadoop-based traffic anomaly detection system. Simulation experiments, with the results compared against the expected outcome, verify that the algorithm is feasible and effective.
【Key words】 traffic anomaly; anomaly detection; Hadoop; MapReduce; traffic characteristics;
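【Illustrative sketch】 The abstract does not say which traffic feature or which stability measure the thesis uses, so the following is only a minimal sketch of the general idea (fixed-packet-count windows, a per-window feature distribution, and a map/reduce split), not the author's algorithm. It assumes the destination IP address as the feature, Shannon entropy as the summary of the distribution, and a fixed threshold on the entropy change between consecutive windows. The names `map_phase`, `shuffle`, `reduce_phase`, `flag_anomalies`, and the `dst_ip` field are hypothetical, and the MapReduce steps are simulated in-process rather than run on Hadoop.

```python
import math
from collections import Counter, defaultdict


def map_phase(packets, window_size):
    """Map step: emit (window_index, feature) pairs.

    A window is a fixed NUMBER of packets, not a time span.
    The feature (destination IP) is an assumption of this sketch.
    """
    for i, pkt in enumerate(packets):
        yield i // window_size, pkt["dst_ip"]


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def entropy(values):
    """Shannon entropy (bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def reduce_phase(groups):
    """Reduce step: one entropy value per window."""
    return {key: entropy(values) for key, values in sorted(groups.items())}


def flag_anomalies(entropies, threshold):
    """Return windows whose entropy differs from the previous window
    by more than threshold (an assumed, hand-picked value)."""
    keys = sorted(entropies)
    return [
        cur
        for prev, cur in zip(keys, keys[1:])
        if abs(entropies[cur] - entropies[prev]) > threshold
    ]


if __name__ == "__main__":
    # Two stable windows followed by one with a scattered distribution.
    ips = ["a", "b", "a", "b", "a", "b", "a", "b", "a", "b", "c", "d"]
    packets = [{"dst_ip": ip} for ip in ips]
    per_window = reduce_phase(shuffle(map_phase(packets, window_size=4)))
    print(per_window)
    print(flag_anomalies(per_window, threshold=0.5))
```

Keying the map output by window index lets each window be reduced independently, which is what makes this kind of per-window statistic a natural fit for MapReduce; the comparison between consecutive windows is done afterwards on the much smaller set of per-window values.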