Research and Application of Visual Clustering Mining of Spatial Data under the Scale Dimension

Study and Application on Visual Multi-scale Spatial Clustering Based on Graph Theory

【Author】 Tu Jiandong (涂建东)

【Advisor】 Chen Chongcheng (陈崇成)

【Author information】 Fuzhou University, Cartography and Geographic Information Systems, 2005, Master's thesis

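【Summary】 In recent years, as automated spatial-information monitoring networks have spread across the globe, the ability to acquire and collect data has grown steadily, and large volumes of multi-source heterogeneous data are now stored in spatial databases, relational databases, and data warehouses. In a sense, we are no longer short of information; we are drowning in it. Discovering the hidden, previously unknown information in large spatial databases in order to support the corresponding applications has therefore become especially important, and this is the task of spatial data mining. Spatial clustering is an important branch of spatial data mining. It differs from traditional clustering because of the characteristics of spatial data itself (high dimensionality, multiple scales, massive volume, and so on) and because of the complex relationships among spatial objects (topological, directional, and metric relationships, among others). These properties make traditional clustering algorithms designed for relational databases and data warehouses unsuitable for spatial databases. This thesis studies spatial clustering algorithms from the perspective of geographic information systems (GIS). It integrates two important characteristics of spatial data (spatial topology and multi-scale structure) and interactive visualization techniques into the mining process and, taking spatial adjacency and multiple scales as its two entry points, makes a preliminary study of the principles and methods of spatial clustering. The main research content and results are as follows: 1) The thesis sets out the theory and concepts of spatial data structures, spatial data mining, and spatial clustering, and systematically analyzes existing spatial clustering algorithms. 2) It discusses spatial data preprocessing and the related visualization techniques and, using a worked example, gives a method for building a multidimensional spatial data cube and performing online analytical processing (OLAP). 3) It studies VSG-CLUST, a spatial clustering algorithm based on spatial adjacency. VSG-CLUST is a visual spatial clustering algorithm based on graph partitioning. Its basic principle is to use two graph-theoretic tools, the Delaunay triangulation and the minimum spanning tree (MST), to bring the adjacency information of geographic entities (their spatial neighbor relationships) into the clustering process. This preserves the basic spatial structure of the objects being clustered and guides the process toward clusters whose members are both similar in their attributes and adjacent in space. The feasibility and effectiveness of the algorithm are verified. 4) It studies an algorithm that performs spatial clustering using multi-scale spatial concept hierarchies. The thesis argues that spatial hierarchy is in essence an expression of the multi-scale nature of spatial data, so a clustering algorithm that takes the scale dimension into account can impose the scale factor as a constraint on the MST partitioning and pruning strategy of VSG-CLUST. The result is a spatial hierarchical clustering algorithm constrained by the scale factor. 5) Finally, building on the above theoretical research and algorithm implementations, the thesis presents what the project team members jointly …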

【Abstract】 With the development of worldwide automatic data collection networks, the ability to collect data keeps growing, and large amounts of data are stored in spatial databases, relational databases, and data warehouses. In some sense, we are submerged in data rather than short of it. This situation creates the need for automated knowledge and information discovery from data, which has led to a promising emerging field: spatial data mining, or spatial knowledge discovery in databases (SKDD). SKDD can be defined as the nontrivial extraction of implicit, previously unknown, and potentially useful information from spatial data. Spatial clustering is one such data mining method. Extracting interesting and useful patterns from spatial datasets is more difficult than extracting patterns from traditional numeric and categorical data, owing to the characteristics of spatial data (e.g., high dimensionality, multiple scales, large volume) and the complexity of spatial relationships (e.g., topological, orientation, and metric relations). In general, clustering algorithms designed for an RDBMS are not applicable in GIS. In this thesis, spatial clustering is studied from the perspective of GIS. We combine two major characteristics of spatial data (spatial topological relations and the multi-scale property) and visualization techniques with the clustering process. We first study an efficient method for spatial clustering that takes the effect of spatial relationships into account, then extend it to multi-scale spatial datasets. The major research results and conclusions of this thesis are as follows: (1) We describe spatial data structures and the related concepts and theory of spatial data mining and spatial clustering. In particular, we divide spatial clustering algorithms into five classes and examine each class in turn. (2) We discuss spatial preprocessing techniques and related visualization techniques. Using sample data, we show a method to construct a spatial data cube and perform online analytical processing (OLAP). (3) We develop a novel visual spatial clustering method named VSG-CLUST, which is able to recognize spatial patterns that involve neighbors. Its principle is to maintain the spatial structure with the help of two graph-theoretic tools, the Delaunay triangulation (DT) and the minimum spanning tree (MST). VSG-CLUST groups and visualizes cluster hierarchies built from both non-spatial and spatial attributes. The usability and effectiveness of VSG-CLUST are demonstrated. (4) We propose a new multi-scale spatial clustering method based on a spatial concept tree. We believe that the spatial hierarchical characteristic is an expression of the spatial multi-scale characteristic. Hence, we regard the scale factor as a kind of constraint and apply it to control the strategy for partitioning an MST, which yields a spatial hierarchical clustering method based on a multi-scale constraint. (5) Building on the theoretical research above, a data mining system called Hsminer was implemented using web services technology as the platform. An application instance of Hsminer is attached at the end of this thesis. In this instance, we use the VSG-CLUST algorithm to mine rules from environmental monitoring data for Fujian province.
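The graph-partitioning idea described in item (3) can be illustrated with a minimal sketch. This is not the thesis's VSG-CLUST implementation, whose details the abstract does not give. It assumes that the neighbor edges have already been derived (for example from a Delaunay triangulation of the site locations), that each object carries a single numeric attribute, and that edges are weighted by the absolute attribute difference. The function name `mst_clusters` and the sample data `sites` and `neighbors` are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def _find(parent: Dict[Hashable, Hashable], x: Hashable) -> Hashable:
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def mst_clusters(attrs: Dict[Hashable, float],
                 neighbor_edges: Sequence[Edge],
                 k: int) -> List[Set[Hashable]]:
    """Split spatially adjacent objects into (at most) k contiguous clusters.

    Assumption: neighbor_edges lists pairs of spatially adjacent objects,
    e.g. the edges of a Delaunay triangulation computed elsewhere.
    """
    # Weight each neighbor edge by attribute dissimilarity (an assumed measure).
    weighted = sorted((abs(attrs[u] - attrs[v]), u, v) for u, v in neighbor_edges)

    # Kruskal's algorithm: build the minimum spanning tree over neighbor edges.
    parent = {n: n for n in attrs}
    mst = []
    for w, u, v in weighted:
        ru, rv = _find(parent, u), _find(parent, v)
        if ru != rv:
            parent[ru] = rv
            mst.append((w, u, v))

    # Partition the tree by dropping its k-1 heaviest edges
    # (mst is in ascending weight order because Kruskal adds edges that way).
    kept = mst[: max(len(mst) - (k - 1), 0)]

    # The connected components of the remaining forest are the clusters.
    parent = {n: n for n in attrs}
    for _, u, v in kept:
        parent[_find(parent, u)] = _find(parent, v)
    groups: Dict[Hashable, Set[Hashable]] = {}
    for n in attrs:
        groups.setdefault(_find(parent, n), set()).add(n)
    return sorted(groups.values(), key=min)


# Hypothetical monitoring sites with one attribute value each.
sites = {"A": 10, "B": 12, "C": 11, "D": 50, "E": 53, "F": 90}
# Hypothetical adjacency, as if taken from a Delaunay triangulation.
neighbors = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"),
             ("D", "E"), ("B", "D"), ("E", "F"), ("D", "F")]

two_clusters = mst_clusters(sites, neighbors, 2)
three_clusters = mst_clusters(sites, neighbors, 3)
```

Because only edges between spatial neighbors can enter the tree, every resulting cluster is spatially connected as well as similar in attribute value. Cutting more edges refines the partition rather than reshuffling it, so the results for increasing `k` are nested. That gives one simple way to read a cluster hierarchy off a single MST, although the scale-constrained partitioning strategy of item (4) may differ from this.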

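  • 【Online publication contributor】 Fuzhou University
  • 【Online publication year and issue】 2005, Issue 08
  • 【Classification number】 P208
  • 【Times cited】 5
  • 【Downloads】 404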