Image Caption Algorithm Based on ViLBERT and BiLSTM

doi:10.15888/j.cnki.csa.008133

Volume 30, Issue 11, 2021, pp. 195-202



Abstract:

Traditional image captioning methods under-utilize the extracted image features, learn too little context information, and require too many training parameters. This study proposes an image captioning algorithm based on Vision-and-Language BERT (ViLBERT) and a Bidirectional Long Short-Term Memory network (BiLSTM). The ViLBERT model serves as the encoder: it combines image features with descriptive text information through a co-attention mechanism and outputs a joint feature vector of image and text. The decoder is a BiLSTM combined with an attention mechanism, which generates the image caption. The algorithm is trained and tested on MSCOCO2014, and its scores on the evaluation criteria BLEU-4 and BLEU are 36.9 and 125.2, respectively. These results indicate that the proposed algorithm outperforms image captioning methods based on traditional image feature extraction combined with an attention mechanism. A comparison of the generated text descriptions further shows that the captions produced by this algorithm describe the image content in more detail.
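
The co-attention idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes plain scaled dot-product attention with no learned projections, multiple heads, or stacked layers, so it is not the ViLBERT implementation or the authors' code. The function names `cross_attend` and `co_attention` and the toy feature vectors are hypothetical, chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Let each query vector attend over the vectors of the other modality.

    Assumption: keys and values are the same vectors (no learned projections).
    """
    dim = len(queries[0])
    attended = []
    for q in queries:
        # Scaled dot-product score between this query and every key.
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            for k in keys_values
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        attended.append([
            sum(w * v[j] for w, v in zip(weights, keys_values))
            for j in range(dim)
        ])
    return attended


def co_attention(image_feats, text_feats):
    """Image regions attend to text tokens, and text tokens attend to image regions."""
    attended_image = cross_attend(image_feats, text_feats)
    attended_text = cross_attend(text_feats, image_feats)
    return attended_image, attended_text


# Toy inputs (hypothetical): three image-region vectors and two token vectors.
image_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_feats = [[1.0, 0.0], [0.0, 1.0]]

attended_image, attended_text = co_attention(image_feats, text_feats)
for row in attended_image:
    print([round(x, 3) for x in row])
```

In this sketch each stream keeps its own length (three image rows, two text rows) while every output row is a mixture of vectors from the other modality. A real encoder would add learned query, key, and value projections and then fuse the two streams into the joint feature vector passed to the decoder.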

Citation:

Xu Hao, Zhang Kai, Tian Yingjie, Chong Faguang, Wang Zichao. Image Caption Algorithm Based on ViLBERT and BiLSTM. Computer Systems & Applications, 2021, 30(11): 195-202


History

Received: December 29, 2020
Revised: February 3, 2021
Online: October 22, 2021

