Computer Systems Applications, 2018, 27(8): 164-169
Website Authority Prediction Based on Deep Learning
(1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China)
Received: December 18, 2017    Revised: January 4, 2018
Abstract: Website authority is generally measured by external links: the more high-quality external links a website has, the more authoritative it is considered. PageRank is a commonly used algorithm of this kind. However, the influence of such algorithms on website authority is selective, which gives this approach certain drawbacks. This study uses deep learning to map search terms and URLs to vectors, and computes the similarity between the two vectors to judge the authority of different URLs under a given search term. Websites whose vectors are highly similar to the vector of the search term are regarded as highly authoritative for that term, which offers another perspective for measuring website authority. Experiments comparing two models, Word2vec and LSTM, on public datasets show that both models are effective and that the LSTM model performs better than the Word2vec model.
Keywords: website authority; Word2vec; LSTM; natural language processing
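The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure is used, so cosine similarity is an assumption here. The vectors are hand-written toy values rather than outputs of a trained Word2vec or LSTM model, and the function names and `site-*.example` hosts are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_urls(query_vec, url_vecs):
    """Return (url, score) pairs sorted from most to least similar to the query vector."""
    scored = [(url, cosine_similarity(query_vec, vec)) for url, vec in url_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (assumed values); in the paper's setting these would come
# from a trained model that maps search terms and URLs to vectors.
query_vec = [0.9, 0.1, 0.0]
url_vecs = {
    "site-a.example": [0.8, 0.2, 0.1],
    "site-b.example": [0.1, 0.9, 0.3],
    "site-c.example": [0.0, 0.1, 0.9],
}

ranking = rank_urls(query_vec, url_vecs)
for url, score in ranking:
    print(f"{url}\t{score:.3f}")
```

Under this sketch, the URL at the top of `ranking` is the one treated as most authoritative for the given search term.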
Funding: National Key Research and Development Program of China (2017YFB0203704)
Cite as:
YANG Hai-Hua, FENG Yang-De, WANG Jue, NIE Ning-Ming, LIU Fang, ZHANG Bo-Yao. Website Authority Prediction Based on Deep Learning. Computer Systems Applications, 2018, 27(8): 164-169