Received: April 07, 2015; Revised: May 12, 2015
Abstract: As data volumes grow rapidly, the efficiency of high utility sequential pattern mining algorithms degrades severely. To address this, we propose HusMaR, a high utility sequential pattern mining algorithm based on MapReduce. Built on the MapReduce framework, the algorithm uses a utility matrix to generate candidates efficiently, a random mapping strategy to balance computing resources, and a field-based pruning strategy to prevent combinatorial explosion. Experimental results show that the algorithm achieves high parallel efficiency on large-scale datasets.
Keywords: sequential pattern; MapReduce; pruning strategy; high utility sequential pattern mining; random strategy
Citation:
CHENG Si-Yuan, MA Chao, LI Cong-Cong. High Utility Sequential Pattern Mining Algorithm Based on MapReduce. Computer Systems Applications, 2015, 24(12): 228-232