数据挖掘在煤炭价格预测中的应用

The Application of Data Mining in the Coal Price Forecast

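【Author】 Zhang Yingchun (张迎春)

【Advisor】 Zhang Yanping (张燕平)

【Author information】 Anhui University (安徽大学), Computer Application Technology, 2006, Master's thesis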

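【Summary】 With the development of modern science and technology, society is becoming increasingly information-driven, and large database systems have been adopted by all kinds of enterprises, which helps standardize information management. The sheer volume of data, however, brings problems of its own. The most common is the so-called "information explosion but knowledge poverty": the amount of information in society is already enormous, yet very little of it is put to use. Research on data mining has been growing, and the technology has been applied in many fields, including banking, telecommunications, insurance, and transportation. Forecasting, an important component of data mining, has likewise attracted wide attention from researchers.

Accurate forecasting plays an important role in today's economy: it helps an enterprise or organization make sound decisions and thereby improve its returns. This thesis mainly discusses the application of data mining to coal price forecasting, where the coal price refers chiefly to the price that power plants pay for coal. China relies mainly on thermal power generation, and coal is the main energy source used. For a power plant, coal reserves are very important: they bear on the sensible allocation of the plant's funds and guarantee the supply of electric power.

Clustering and classification are two different forecasting methods, and this thesis uses both to study coal price forecasting.

Clustering is one of the most basic human cognitive activities; things become amenable to study only when they are suitably clustered. A commonly used clustering method is the k-means algorithm, whose drawbacks are that the value of k must be given in advance and that the clustering result depends heavily on the choice of initial values. The LBG algorithm is another commonly used method; it clusters well, but it takes a long time and easily falls into local minima.

Classification is another of the most basic forms of cognition. As an important topic in data mining, it developed early in statistics, machine learning, artificial intelligence, and related fields. In recent years it has been combined with database technology to solve practical problems. Many classification-based forecasting methods exist; common ones include the decision tree algorithm (C4.5), Bayesian classification, the BP algorithm, and support vector machines (SVM). Each has shortcomings. The first three leave room for improvement in both experimental results and speed. SVM achieves good accuracy and has a clear advantage on small samples and nonlinear data, but its results are hard to interpret and no complete method has been given for choosing the kernel function. Professor Zhang Ling (张铃) proved theoretically that SVM and three-layer feedforward neural networks are similar in recognition ability, introduced the idea of kernel functions into the cross covering algorithm, and proposed the kernel covering algorithm. This further refines the covering algorithm and thereby improves its accuracy.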
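The dependence of k-means on its initial values, noted above, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `kmeans_1d`, the one-dimensional setting, and the price values are all assumptions made up for illustration, and no real coal price data is used.

```python
def kmeans_1d(points, init_centroids, max_iter=100):
    """Plain k-means on 1-D data with caller-supplied initial centroids.

    Returns the final centroids and the within-cluster sum of squared errors.
    """
    centroids = [float(c) for c in init_centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        updated = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:  # no centroid moved, so the run has converged
            break
        centroids = updated
    # Squared distance from every point to its nearest final centroid.
    sse = sum(min((p - c) ** 2 for c in centroids) for p in points)
    return centroids, sse


# Hypothetical purchase prices (made-up numbers) forming three obvious groups.
prices = [301, 302, 303, 311, 312, 313, 321, 322, 323]

# Same data, same k = 3, two different sets of initial centroids.
poor_centroids, poor_sse = kmeans_1d(prices, [301, 302, 303])
good_centroids, good_sse = kmeans_1d(prices, [302, 312, 322])

print(poor_centroids, poor_sse)
print(good_centroids, good_sse)
```

With these made-up values, the first start converges to centroids 301, 302.5 and 317, which merges two of the three groups, while the second start recovers one centroid per group with a much smaller squared error. Both runs use the same k, so the difference comes only from the initial values.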

【Abstract】 With the development of modern science and technology, society is becoming increasingly information-driven, and large-scale database software has been introduced into many enterprises, which favors the standardized management of information. However, huge volumes of data bring problems of their own. A well-known one is "information explosion but knowledge poverty": the amount of information in society is large, yet little of it is used. Data mining technologies have therefore been studied and applied in many fields, including banking, telecommunications, insurance, and transportation. Forecasting, an important part of data mining, has been studied by many researchers.

In today's economy, accurate forecasting is important because enterprises depend on it to make sound decisions. This dissertation discusses data mining technologies applied to coal price forecasting, where the coal price means the purchase price paid by power plants. In China most power plants are thermal power plants, so coal is the primary energy source. For power plants, coal reserves are important: they affect the allocation of funds and assure the supply of electric power.

Clustering and classification are two different forecasting methods. This dissertation mainly discusses coal price forecasting using these two methods.

Clustering is one of the basic cognitive activities; objects become convenient to study once they are properly clustered. The k-means algorithm is a common clustering method. Its disadvantage is that the value of k must be given in advance, which affects the clustering result. LBG is another common algorithm; it gives good clustering results, but its clustering time is long and it easily falls into local minima.

Classification is another basic cognitive function. As an important topic in data mining, data classification developed early in statistics, machine learning, artificial intelligence, and related fields. Recently it has been combined with database technologies to solve practical problems. There are many classification methods for forecasting, such as the decision tree algorithm (C4.5), the Bayes algorithm, the BP algorithm, and SVM. All of them have their own disadvantages. The first three need improvement in results and speed. Although SVM has a superior recognition rate, especially for small samples and

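  • 【Online publication submitter】 Anhui University (安徽大学)
  • 【Online publication year and issue】 2006, Issue 12
  • 【Classification number】 TP311.13
  • 【Times cited】 11
  • 【Downloads】 653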