COMPUTER SYSTEMS APPLICATIONS, 2010, 19(3): 107-110
Subject Phrase-Based Clustering Algorithm for Search Engine Results
(College of Computer and Communication Engineering, China University of Petroleum (East China), Dongying, Shandong 257061)
Received: June 25, 2009
Abstract: Search engine results often mix several subjects. To help users locate valuable information quickly and accurately, a clustering method for search engine results based on subject phrases is proposed. The algorithm first extracts the query terms from the search results and combines them with adjacent words to form subject phrases. It then builds a hybrid vector space model containing both high-frequency individual words and subject phrases, and uses the Tongyici Cilin synonym thesaurus to expand the feature terms semantically. Finally, an improved k-means algorithm clusters the search results, and a label is extracted for each cluster. Experimental results show that the algorithm effectively improves the accuracy of the clustering results.
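The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how k-means was improved, so plain k-means with cosine similarity is used instead. Whitespace tokenization of English snippets, the tiny `SYNONYMS` map (a stand-in for Tongyici Cilin), the `min_df` threshold, seeding with the first k vectors, and the phrase-preferring label rule are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy stand-in for the Tongyici Cilin synonym thesaurus.
SYNONYMS = {"automobile": "car"}

def subject_phrases(tokens, query):
    """Pair each occurrence of the query term with its left and right neighbour."""
    phrases = []
    for i, tok in enumerate(tokens):
        if tok == query:
            if i > 0:
                phrases.append(tokens[i - 1] + " " + query)
            if i + 1 < len(tokens):
                phrases.append(query + " " + tokens[i + 1])
    return phrases

def build_vectors(snippets, query, min_df=2):
    """Hybrid vector space: words and subject phrases found in >= min_df snippets."""
    docs = []
    for s in snippets:
        toks = [SYNONYMS.get(t, t) for t in s.lower().split()]  # synonym expansion
        feats = [t for t in toks if t != query] + subject_phrases(toks, query)
        docs.append(Counter(feats))
    df = Counter(f for d in docs for f in d)  # document frequency per feature
    vocab = sorted(f for f, n in df.items() if n >= min_df)
    return vocab, [[float(d[f]) for f in vocab] for d in docs]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def kmeans(vectors, k, iters=10):
    """Plain k-means (not the paper's improved variant); first k vectors are seeds."""
    centroids = [list(v) for v in vectors[:k]]
    assign = [0] * len(vectors)
    for _ in range(iters):
        assign = [max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vectors]
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centroids

def cluster_label(vocab, centroid):
    """Heaviest centroid feature; ties go to phrase features (an assumed rule)."""
    best = max(range(len(vocab)), key=lambda i: (centroid[i], " " in vocab[i]))
    return vocab[best]

snippets = [
    "jaguar car dealer price",
    "jaguar cat habitat rainforest",
    "new jaguar automobile price review",
    "the jaguar cat lives in rainforest",
]
vocab, vectors = build_vectors(snippets, "jaguar")
assign, centroids = kmeans(vectors, k=2)
labels = [cluster_label(vocab, c) for c in centroids]
print(assign)  # [0, 1, 0, 1]
print(labels)  # ['jaguar car', 'jaguar cat']
```

In this toy run the subject phrases "jaguar car" and "jaguar cat" separate the two senses of the query and also serve as readable cluster labels, which is the motivation the abstract gives for adding phrases to the feature set.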
Citation:
SUO Hong-Guang, SUN Shan-Shan, WANG Yu-Wei, LIANG Yu-Huan. Subject Phrase-Based Clustering Algorithm for Search Engine Results. COMPUTER SYSTEMS APPLICATIONS, 2010, 19(3): 107-110