Computer Systems Applications, 2023, 32(10): 147-156
Border Peeling Clustering Based on Shared Nearest Neighbors and Optimized Association Strategy
(1. School of Data Science, University of Science and Technology of China, Hefei 230026, China; 2. School of Mathematical Sciences, University of Science and Technology of China, Hefei 230026, China; 3. CAS Key Laboratory of Wu Wen-Tsun Mathematics (University of Science and Technology of China), Hefei 230026, China)
Received:March 21, 2023    Revised:April 20, 2023
Abstract: Border peeling (BP) is a density-based clustering algorithm that iteratively peels off border points to reveal the latent cores of clusters, and it has proven to be an effective clustering method. However, BP has two limitations. First, the local density of a data point considers only distance, so border points are not always identified reasonably. Second, the association strategy in BP tends to misjudge outliers and can propagate errors when border points are assigned to clusters. To address these issues, this study proposes a border peeling clustering algorithm based on shared nearest neighbors and an optimized association strategy (SOBP). The algorithm uses a local density function based on shared nearest neighbors to better capture the similarity between data points. It also optimizes the association strategy of BP so that, in each iteration, a border point is no longer associated with only one non-border point, and it further adopts a dual association criterion that links border points both to non-border points and to previously peeled border points. Tests on several datasets show that SOBP outperforms six other classical algorithms on the evaluation metrics.
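To make the idea of a shared-nearest-neighbor density concrete, the following is a minimal sketch and not the SOBP density function itself, since the abstract does not give its formula. It assumes a Jarvis-Patrick style similarity (two points are similar only if each is among the other's k nearest neighbors, with similarity equal to the number of neighbors they share), sums that similarity over a point's neighbors as its density, and treats the lowest-density points as the border points of one peeling iteration. The function names `knn_sets`, `snn_density`, and `peel_once`, and the parameters `k` and `fraction`, are hypothetical.

```python
import math


def knn_sets(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors (brute force)."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(others[:k]))
    return neighbors


def snn_density(points, k):
    """Assumed density: sum of shared-neighbor counts over mutual k-nearest neighbors."""
    nn = knn_sets(points, k)
    density = []
    for i in range(len(points)):
        total = 0
        for j in nn[i]:
            if i in nn[j]:  # mutual-neighbor condition (an assumption of this sketch)
                total += len(nn[i] & nn[j])
        density.append(total)
    return density


def peel_once(points, k, fraction):
    """Return the indices of the lowest-density points, treated as border points."""
    density = snn_density(points, k)
    n_border = max(1, int(fraction * len(points)))
    order = sorted(range(len(points)), key=lambda i: density[i])
    return set(order[:n_border])


if __name__ == "__main__":
    # Five points in a tight cluster plus one distant point.
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
    print(snn_density(pts, k=3))
    print(peel_once(pts, k=3, fraction=0.2))
```

In this sketch the distant point gets zero density because no other point lists it as a neighbor, whereas a purely distance-based score would only say that it is far away. A full border peeling procedure would repeat the peeling step and then associate each peeled point with the remaining points; that association step is not shown here.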
Funding: National Natural Science Foundation of China (12071453); Major Project on Quantum Communication and Quantum Computers (2021ZD0302904)
Citation:
FENG Jie-Jing, HOU Xin-Min. Border Peeling Clustering Based on Shared Nearest Neighbors and Optimized Association Strategy. Computer Systems Applications, 2023, 32(10): 147-156