Unsupervised Document Representation Learning Based on Hierarchical Attention Model

doi:10.15888/j.cnki.csa.006533

Issue: 2018, Vol. 27, No. 9, pp. 40-46

Funding: National Natural Science Foundation of China (61673364)

Abstract:

Many natural language applications need to represent input text as a fixed-length vector. Existing techniques such as word embeddings and document representations provide feature representations for natural language tasks, but they do not account for differences in the importance of individual words within a sentence, nor for differences in the importance of individual sentences within a document. This paper proposes a document representation model based on a hierarchical attention mechanism (HADR) that takes into account both the important sentences in a document and the important words in each sentence. Experimental results show that document representations that account for word importance and sentence importance perform better. On sentiment classification of documents (IMDB), the model achieves higher accuracy than the Doc2Vec and Word2Vec models.
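
The abstract describes two levels of weighting: words within a sentence, then sentences within a document. The sketch below illustrates that general idea with plain attention pooling. It is not the authors' HADR implementation, whose training objective and parameterization are not given on this page. The function names, the dot-product scoring, and the fixed `word_context` and `sentence_context` vectors are all assumptions made for illustration; in a trained model such context vectors would normally be learned.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def attention_pool(vectors, context):
    """Return the attention-weighted average of `vectors` and the weights.

    Scores are dot products with `context` (an assumed scoring function).
    """
    weights = softmax([dot(v, context) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights

def document_vector(doc, word_context, sentence_context):
    """Build a fixed-length document vector from nested word vectors.

    `doc` is a list of sentences; each sentence is a list of word vectors.
    Level 1 pools words into sentence vectors, level 2 pools sentences.
    """
    sentence_vecs = [attention_pool(sent, word_context)[0] for sent in doc]
    return attention_pool(sentence_vecs, sentence_context)

# Toy input: two sentences with 2-dimensional word vectors (made-up values).
doc = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [2.0, 0.0], [0.0, 0.0]],
]
doc_vec, sent_weights = document_vector(doc, [1.0, 0.0], [0.0, 1.0])
print(doc_vec, sent_weights)
```

Whatever the number of sentences or words, the output has the same dimensionality as the word vectors, which is the fixed-length property the abstract asks for. With an all-zero context vector the weights become uniform and the pooling reduces to a plain average, so the attention weights are the only thing separating this from an unweighted mean of embeddings.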

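Cite this article

Ouyang Wenjun (欧阳文俊), Xu Linli (徐林莉). Unsupervised Document Representation Learning Based on Hierarchical Attention Model. 计算机系统应用 (Computer Systems & Applications), 2018, 27(9): 40-46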

History

Received: 2018-01-17
Last revised: 2018-02-09
Published online: 2018-07-26
