空间数据库中离群点的度量与查找新方法
New Approach of Spatial Outliers Measurement and Detection in Spatial Databases
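【Abstract (translated from Chinese)】 Current outlier detection methods fall into two classes. The first class targets statistical databases: it treats all data as one multidimensional space and does not distinguish spatial dimensions from non-spatial dimensions. The second class targets spatial databases and does distinguish spatial from non-spatial dimensions. Most methods proposed so far belong to the first class; applied directly to spatial databases, they may produce wrong judgments or find meaningless outliers. Existing methods of the second class are either too inefficient or unable to find local outliers. To address this, a new neighborhood-based outlier measure, the spatial deviation factor (空间偏离因子), is proposed. It targets spatial databases, distinguishes spatial from non-spatial dimensions, and can find both local and global outliers. A fast detection algorithm combined with neighborhood partitioning is also proposed. Theoretical analysis shows that the method is reasonable, and experiments on real and synthetic data further confirm the feasibility of the model and the algorithm.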
【Abstract】 Outlier detection algorithms generally fall into two classes. The first is usually applied to statistical databases; it treats all attributes as a single multidimensional space and does not distinguish between geo-spatial and non-spatial dimensions during detection. The second is usually applied to spatial databases and does distinguish between geo-spatial and non-spatial dimensions. Most existing approaches belong to the first class. If they are applied directly to geo-spatial databases, meaningless or incorrect outliers may be found. The existing approaches in the second class are either inefficient or cannot detect local outliers. To overcome these shortcomings, a new neighborhood-based spatial outlier factor is proposed for detecting outliers in spatial databases. The proposed approach is spatial-database-oriented and is intended to find both local and global outliers. Theoretical analysis shows that the approach is reasonable. Experimental results on synthetic and real data sets show that it is practical.
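The distinction between spatial and non-spatial dimensions can be illustrated with a minimal sketch. The abstract does not give the definition of the proposed factor, so the score below is a simple stand-in and not the paper's measure: neighbors are chosen using spatial coordinates only, and each point's non-spatial attribute is then compared with the mean over its spatial neighbors. The function names `k_nearest_spatial` and `neighborhood_deviation`, the parameter `k`, and the sample data are all hypothetical.

```python
import math
from statistics import mean

def k_nearest_spatial(points, i, k):
    """Indices of the k points closest to points[i], using only (x, y)."""
    xi, yi, _ = points[i]
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist((xi, yi), points[j][:2]))
    return others[:k]

def neighborhood_deviation(points, k):
    """Illustrative score (an assumption, not the paper's factor):
    |attribute value - mean attribute value of the k spatial neighbors|."""
    scores = []
    for i, (_, _, value) in enumerate(points):
        nbrs = k_nearest_spatial(points, i, k)
        nbr_mean = mean(points[j][2] for j in nbrs)
        scores.append(abs(value - nbr_mean))
    return scores

# Hypothetical data: (x, y, non-spatial attribute).
# Two spatial clusters with attribute values near 10 and near 100.
# The point at (1, 1) has value 60: unremarkable against the global
# range of values, but unlike its spatial neighbors.
points = [
    (0.0, 0.0, 10), (0.0, 1.0, 11), (1.0, 0.0, 9),
    (1.0, 1.0, 60), (0.5, 0.5, 10),
    (10.0, 10.0, 100), (10.0, 11.0, 101),
    (11.0, 10.0, 99), (11.0, 11.0, 100),
]

scores = neighborhood_deviation(points, k=3)
most_deviant = max(range(len(points)), key=lambda i: scores[i])
print(points[most_deviant], scores[most_deviant])
```

In this toy data the point at (1, 1) receives the largest score, which is the kind of local outlier that a method mixing spatial and non-spatial attributes into one distance could miss. The brute-force neighbor search here is quadratic in the number of points; the paper's fast algorithm based on neighborhood partitioning is not reproduced.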
【Key words】 spatial databases; spatial data mining; spatial outliers; spatial neighborhood
- 【Source】 中国图象图形学报 (Journal of Image and Graphics), 2006, Issue 07
- 【Classification No.】 TP311.13
- 【Times cited】 13
- 【Downloads】 252