Abstract: Although the deep learning method has made a huge breakthrough in machine learning, it requires a large amount of manual work for data annotation. Limited by labor costs, however, many applications are expected to reason and judge the instance labels that have never been encountered before. For this reason, zero-shot learning (ZSL) came into being. As a natural data structure that represents the connection between things, the graph is currently drawing more and more attention in ZSL. Therefore, this study reviews the methods of graph-based ZSL systematically. Firstly, the definitions of ZSL and graph learning are outlined, and the ideas of existing solutions for ZSL are summarized. Secondly, the current ZSL methods are classified according to different utilization ways of graphs. Thirdly, the evaluation criteria and datasets concerning graph-based ZSL are discussed. Finally, this study also specifies the problems to be solved in further research on graph-based ZSL and predicts the possible directions of its future development.
Abstract: Operating systems play an important role in modern life. To serve different hardware environments and diverse application scenarios, they need to be scalable and flexible while maintaining good performance. A multicore operating system, as a kind of distributed operating system, is one of the solutions to this problem. This study analyzes the design principles of multicore operating systems, investigates existing multicore operating system technologies, and compares these technologies with other relevant ones. Finally, the current situation and trend of the research on multicore operating systems are summarized.
Abstract: Leaf vein segmentation is an important step in leaf pattern analysis, which is of great significance for soybean variety identification and phenotype research. On account of the complicated vein structure of soybean leaves and the low contrast of the leaf area where the veins are located, it is generally impossible to achieve ideal leaf vein segmentation results using gray information alone. This study presents a soybean vein segmentation method combining the multi-scale gray unconstrained hit-or-miss transform (UHMT) algorithm and a processing method based on the hue data of the HSI color space. In this method, the gray information in the RGB color space and the hue data in the HSI color space are used to segment the global leaf veins and the local primary and secondary veins from soybean leaf images, respectively. The former uses iterative threshold segmentation to extract the leaf area and eliminates interference factors such as the outer contour and the petiole through dilation and erosion to obtain the leaf area image. Then, the multi-scale gray UHMT algorithm is employed to obtain the global leaf vein image. Because primary and secondary veins are segmented poorly by gray information alone, we use hue data to enlarge the discrepancies in gray values between vein pixels and other pixels to realize the segmentation of local primary and secondary veins. The obtained global and local vein images are fused into the final soybean leaf vein image. Moreover, this study uses soybean leaf images from the soybean leaf image database SoyCultivar to verify the effectiveness of the algorithm. The results indicate that this algorithm is better than existing leaf vein segmentation methods, as it not only extracts soybean leaf veins completely but also effectively eliminates the background, leaf contours, petioles, and other irrelevant components.
Abstract: Single object tracking is a research focus in the field of computer vision. Traditional algorithms including correlation filtering have fast tracking speed but generally low tracking accuracy due to the roughness of extracted manual features such as color and gray levels. With the development of deep learning theory in recent years, tracking methods using deep features can achieve a good balance between tracking accuracy and speed. This study first introduces the relevant background of single object tracking and then sorts out multiple algorithms that have emerged in the development of single object tracking from the two stages of single object tracking based on correlation filters and deep learning. The current mainstream Siamese network algorithms are also introduced in detail. Finally, a large data set is used to compare and analyze the excellent algorithms that have emerged in recent years. In view of the shortcomings and deficiencies of these algorithms, the development prospects of this field are provided in this study.
Abstract: Pattern recognition of electroencephalogram (EEG) signals during motor imagery (MI) has been one of the most important research directions in the field of non-invasive brain-computer interface (BCI). In recent years, deep learning has further improved the recognition accuracy of EEG signals during MI. However, given the strong time variability of EEG signals, there are still some problems such as insufficient training samples and too high feature dimensions. To solve the above problems, this study proposes a new training strategy called “overlapped time slice”. Based on the existing cropped time slice strategy, this study adopts a novel overlapped time slice strategy and constructs a new loss function calculation and label prediction method with the overlapped time slice set. The overlapped time slice strategy can not only further increase the number of training samples but also reduce the feature space of a single sample to improve the performance of the deep neural network in EEG signal recognition. For the verification of the feasibility and effectiveness of the proposed overlapped strategy, three open-source EEG signal datasets, namely the BCI Competition IV datasets 1, 2a, and 2b, are selected in this study, and five kinds of deep neural network models are built on these three datasets. During experiments, the performance and efficiency of MI recognition are compared between the cropped strategy and the overlapped strategy. Experimental results show that the overlapped strategy has better recognition performance than that of the cropped strategy. Finally, nine groups of experiments are designed with different parameter combinations by adjusting the parameters of the overlapped time slice strategy. The experimental results demonstrate that parameter combination affects the final classification performance and that the classification performance is not in a linear relationship with the efficiency. 
The recognition accuracy of the proposed overlapped strategy on datasets 1, 2a, and 2b is 92.3%, 77.8%, and 86.3%, respectively. Compared with the conventional cropped strategy, the proposed overlapped strategy improves the performance significantly without necessarily reducing the efficiency.
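The overlapped time slice idea can be illustrated with a short Python sketch. It is not the authors' implementation: the function names, the `window` and `stride` parameters, and the rule of averaging per-slice class probabilities to label a trial are all assumptions made for illustration, since the abstract does not give the loss function or the label prediction rule.

```python
def overlapped_slices(trial, window, stride):
    """Cut one trial (a sequence of samples) into fixed-length slices.

    Consecutive slices overlap whenever stride < window, so a smaller
    stride yields more training samples from the same trial.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    last_start = len(trial) - window
    return [trial[s:s + window] for s in range(0, last_start + 1, stride)]


def predict_trial(slice_probs):
    """Label a trial from its slices (assumed rule: average, then argmax).

    slice_probs holds one list of class probabilities per slice.
    """
    n_classes = len(slice_probs[0])
    mean = [sum(p[c] for p in slice_probs) / len(slice_probs)
            for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)
```

In this sketch each slice is shorter than the full trial, which is how a smaller per-sample feature space and a larger sample count arise together.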
Abstract: With the development of science and technology, the deployment of large-scale quantum computers is becoming possible, and public-key cryptographic algorithms based on certain hard problems will be broken efficiently by quantum algorithms. The security of traditional trusted hardware chips such as TCM/TPM will be seriously affected due to the wide use of public-key cryptosystems such as RSA, SM3, and ECC, and most quantum-resistant (QR) cryptographic algorithms cannot be implemented on hardware chips with limited computational resources. Therefore, it is necessary to redesign the QR trusted computing platform. In this study, considering the security challenges faced by trusted computing in quantum computing models, we summarize the present state of QR trusted computing research and propose a QR trusted computing technology system. Building on existing post-quantum cryptographic protocols and the trusted computing software and hardware technology framework, we port the QR cryptographic algorithms and protocols to the trusted computing platform and implement a prototype system of a QR trusted computing security support platform based on TCM. The work includes the design of the primitive root key and QR extensions such as the TCM cipher library, remote attestation, and LDAA. Finally, the results of function and performance tests on the emulator for the above TCM modules show that the prototype system is resistant to attacks by quantum algorithms, with acceptable application performance overhead.
Abstract: Text matching is a core research area in natural language processing. Deep text matching models can be broadly classified into representational models and interactive models. The former tend to lose semantic focus and fail to measure the contextual importance of words. The latter lack global information such as sentence type and inter-sentence information. To address these problems, we propose a text matching model incorporating multi-angle features based on a Siamese neural network. The model generates word vectors using the BERT model and enhances semantic features by the similarity fusion of words. It then uses Bi-LSTM to encode the syntactic structured features, namely the syntactic structured information containing the lexical sequence. A Transformer encoder is utilized to realize the multi-level interaction between the syntactic structured features and the text features. Finally, the similarity is computed from the concatenated vectors. Experiments on a subset of the Quora Question Pairs dataset show that this model performs better than classical deep matching models.
Abstract: Automatic recognition of crop leaf diseases is an important application of computer vision technology in agriculture. In recent years, deep learning methods have made some progress in the recognition of crop leaf diseases, and they are all based on deep feature representations of a single deep convolutional neural network (CNN) model. However, the useful fact that the image representation ability of different deep CNN models is complementary has not received attention for research. Thus, this study proposes a network model MDFF-Net for fusing different deep features. MDFF-Net connects two pre-trained deep CNN models in parallel and then sets a fully connected layer with the same number of neurons for each model to transform the deep features output by different models into features with the same dimension. Then, through the non-linear transform of two fully connected layers, the effect of feature fusion is further improved. We choose VGG-16 and ResNet-50 as the feature extractors of MDFF-Net and conduct experiments on a public dataset containing five apple leaf diseases. The experimental results show that the recognition accuracy of MDFF-Net is 96.59%, which is better than the results achieved by VGG-16 or ResNet-50 alone and thus proves the effectiveness of the deep feature fusion method.
Abstract: In sinter production, the pores formed in the sinter are an important parameter for sinter quality evaluation. Because sinter pores vary in shape and have fuzzy edges, the segmentation error rate for pores is high. To segment the pores more accurately, the sinter image is first preprocessed with OpenCV. After a comparative analysis of traditional image segmentation algorithms, this study proposes an algorithm based on an improved UNet network to segment the preprocessed sinter pore image. An improved module combining residual connections and concatenation is introduced into the UNet encoder to obtain more information on pore features. The experimental results show that the improved algorithm is better than both the original UNet network and the traditional image segmentation algorithms in mean intersection over union (MIoU) and the Dice index.
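For reference, the two evaluation indexes named above can be computed as in the following Python sketch. It assumes flattened label masks given as equal-length lists and a two-class setting (0 for background, 1 for pore); the function names and the convention of scoring an absent class as 1.0 are illustrative choices, not details taken from the paper.

```python
def iou_and_dice(pred, truth, label):
    """Return (IoU, Dice) of one class for two flattened label masks."""
    tp = fp = fn = 0
    for p, t in zip(pred, truth):
        if p == label and t == label:
            tp += 1          # pixel correctly assigned to the class
        elif p == label:
            fp += 1          # predicted as the class but is not
        elif t == label:
            fn += 1          # belongs to the class but was missed
    union = tp + fp + fn
    iou = tp / union if union else 1.0
    dice = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    return iou, dice


def mean_iou(pred, truth, labels=(0, 1)):
    """Average the per-class IoU over the given labels (MIoU)."""
    return sum(iou_and_dice(pred, truth, k)[0] for k in labels) / len(labels)
```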
Abstract: Fundus images play a key role in finding ophthalmic diseases that can be observed from retinal vessels and provide an effective reference for medical professionals. However, manual vessel segmentation is time-consuming and laborious. Therefore, an automatic and intelligent segmentation method would be of great benefit to relevant personnel. In this study, the attention mechanism and the RU-Net structure are integrated into the generator of a generative adversarial network (GAN), forming a new structure, Retina-GAN. At the same time, automatic color equalization (ACE) is applied in the preprocessing of fundus images to improve image contrast and make blood vessels clearer. To validate the proposed approach, we compare Retina-GAN with several other models on the DRIVE dataset. Accuracy, sensitivity, and specificity are measured for comparative analysis. The experiments show that Retina-GAN performs better than the other models.
Abstract: The video magnification technology gives people an opportunity to observe and study small changes in things. A video is decomposed by the complex steerable pyramid and the motion in the video is manipulated by analyzing the phase differences at different scales and in different directions. Magnified signals at some scales in the complex steerable pyramid exceed their phase shift limits, resulting in artifacts and blurring. This problem can be solved by adjusting the magnification factor for each scale. In this paper, a video magnification algorithm based on multi-scale filtering is proposed. A relation between the spatial wavelength of a video frame image and vibration displacement is established to determine the upper limit of the magnification factor for each scale. The manual setting of a cut-off wavelength is no longer necessary, and the amplification factor for each scale in the complex steerable pyramid is adjusted automatically so that magnified signals can adapt to their phase shift limits. The vibration attached to the large motion is magnified to verify the performance of the proposed algorithm, and the result shows that it has obvious advantages over the existing video amplification algorithms.
Abstract: Compared with other fine-grained image classification tasks, wild snake classification is more difficult and complicated, because the varied postures, rapid posture changes, and frequent motion or coiling of snakes make it hard to judge and classify them by local characteristics. In response, this paper applies the self-attention mechanism to fine-grained wild snake image classification to solve the problem that, as the number of layers increases, the convolutional neural network focuses so much on local parts that it ignores global information. Transfer learning is implemented through Swin Transformer (Swin-T) to obtain a fine-grained feature extraction model. To further study the performance of the self-attention mechanism in meta-learning, this paper improves the feature extraction model, builds a Siamese network, and constructs a meta-learner to learn and classify a small number of samples. Compared with other methods, the proposed method reduces the time and space consumption caused by feature extraction, improves the accuracy and efficiency of meta-learning classification, and increases the learning autonomy of meta-learning.
Abstract: To address the cold-start and sparsity problems of recommendation systems, this paper proposes a recommendation model based on a heterogeneous information network. Previous approaches were unable to take into account both knowledge graph representation learning and implicit path information, which made the performance of knowledge recommendation systems mediocre. The proposed method sets meta-paths in the heterogeneous information network and integrates them into knowledge graph representation learning through a graph neural network (GNN). Next, an attention network is used to connect the recommendation task with the knowledge graph representation task. It not only learns the latent features of the two tasks but also enhances the interactions between the recommended items in the recommendation system and the entities in the knowledge graph. Finally, the user click rate is predicted in the recommendation task. The method is evaluated on the open dataset Book-Crossing and a knowledge graph constructed from the DBLP dataset, and the results demonstrate that the proposed model outperforms other algorithms in terms of area under the curve (AUC), recall, and F1-score.
Abstract: YOLOx-Darknet53 is an improved detection network that builds on you only look once version 3 (YOLOv3) and adds various tricks. Nevertheless, it still uses Darknet53 as the backbone network to extract features, so the feature extraction capability of the network is still insufficient. In this paper, we obtain a contextual attention (CoA) module by improving the attention mechanism in CoTNet and replace the 3×3 convolution in the residual block of the YOLOx backbone network with this module. The resulting residual block with attention fusion strengthens the feature extraction capability of the backbone network. A comparison experiment is conducted on the Pascal VOC2007 data set. The mean average precision AP@[.5:.95] and the AP@0.5 of the network integrating the CoA module are both 1.4 higher than those of the original network. After the backbone network is improved, a parameter-free 3D attention module is added in front of the YOLOx detection head to obtain the final improved detection network. The results of another round of the above comparative experiment show that the AP@[.5:.95] and the AP@0.5 of the final network are respectively 1.6 and 1.5 higher than those of the original network. Therefore, the improved network is more accurate than the original network in detection and can achieve better detection effects in industrial applications.
Abstract: This study designs and develops a representation and search method and a relevant system for dynamic linked data networks. The method can help users obtain the association around an entity when there are complicated entities and associations, and the network can be dynamically expanded through guided interaction. When users know multiple entities which are suspected to be associated, the minimum connected graph algorithm based on distributed computing is used to search out the association network. Application cases show that the proposed method and system can achieve good results and user experiences, and the system based on the method has been applied in many engineering projects such as Smart City, Safe City and Metropolitan IoT.
Abstract: To accurately classify Sina microblog comment information, this study proposes an improved genetic algorithm-improved particle swarm optimization-balanced support vector machine (GA-IPSO-BSVM) classification model to enhance the accuracy and convergence of classifying Sina microblog comment information. Firstly, to effectively improve the algorithm convergence speed and efficiently save computational resources, this model introduces the elimination mechanism of the GA in the early iteration to remove a large number of low-speed particles. Secondly, to avoid the algorithm being trapped in local optima and improve the topology of particle relations in PSO, this study utilizes a K-means clustering algorithm to perform cluster partition of particle swarms in the middle of the iteration. The particle swarms are iterated in the communities and excellent particles are selected in each community. Thirdly, all excellent particles in the communities are combined into an excellent particle swarm that is iterated to derive the global optimal solution in the late iteration. Fourthly, the hyperparameter optimization of BSVM is performed by combining GA with IPSO to enhance classification accuracy. Finally, the proposed GA-IPSO-BSVM model is used for verifying the classification and prediction of Sina microblog comment information. The experimental results demonstrate the superiority of our proposed classification model over other benchmark models applied to Sina microblog comment information classification in terms of accuracy improvement.
Abstract: The security of electric energy plays an important role in national security. With the development of power 5G communication, a large number of power terminals have positioning demand. The traditional global positioning system (GPS) is vulnerable to spoofing. How to improve the security of GPS effectively has become an urgent problem. This study proposes a GPS spoofing detection algorithm with base station assistance in power 5G terminals. It uses the base station positioning with high security to verify the GPS positioning that may be spoofed and introduces the consistency factor (CF) to measure the consistency between GPS positioning and base station positioning. If CF is greater than a threshold, the GPS positioning is classified as spoofed. Otherwise, it is judged as normal. The experimental results show that the accuracy of the algorithm is 99.98%, higher than that of traditional classification algorithms based on machine learning. In addition, our scheme is also faster than those algorithms.
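The threshold test on the consistency factor can be sketched in Python as follows. The abstract does not define CF, so this sketch assumes a simple stand-in: the great-circle distance between the GPS fix and the base-station fix, divided by an assumed base-station positioning error. The names `consistency_factor` and `is_spoofed` and the default values of `base_error_m` and `threshold` are hypothetical.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    r = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def consistency_factor(gps, base, base_error_m):
    """Assumed CF: position disagreement scaled by base-station error."""
    return haversine_m(*gps, *base) / base_error_m


def is_spoofed(gps, base, base_error_m=50.0, threshold=3.0):
    """Flag the GPS fix as spoofed when CF exceeds the threshold."""
    return consistency_factor(gps, base, base_error_m) > threshold
```

Under these assumptions, a GPS fix that agrees with the base station to within a few multiples of the base-station error is accepted, and a larger disagreement is flagged.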
Abstract: The pupil distance of human eyes is a basic parameter which needs to be measured accurately in optometry research and glasses prescription. It is of great application value to realize automatic pupil distance detection. With the help of a Gaussian skin color model, this study proposes a method of pupil location and pupil distance calculation based on gray integral projection and Hough circle transform. Firstly, the image is subjected to the preprocessing of illumination compensation by the adaptive brightness correction method of a two-dimensional gamma function. On this basis, the face area is extracted by using a skin color probability model. Then, the gray integral projection method is used to extract the eye area by selecting an appropriate threshold. After morphological image processing, the Canny edge detection operator is utilized to detect the pupil edge. Finally, the pupil location is carried out by Hough circle transform, and the center distance is calculated. The results show that the algorithm has a fast pupil location speed and can realize the automatic pupil distance measurement more accurately, which lays the foundation for the research on intelligent technologies of pupil distance measurement.
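Two steps of this pipeline, the gray integral projection and the final distance calculation, can be illustrated with the Python sketch below. It assumes the image is a list of rows of gray values, that the eye band corresponds to rows whose projection falls below a chosen threshold, and that a known `mm_per_pixel` scale converts pixels to millimeters; these names and conventions are assumptions, and the illumination compensation, skin color model, Canny, and Hough steps are not shown.

```python
import math


def integral_projections(gray):
    """Return (row_sums, column_sums) of a gray image given as a list of rows."""
    rows = [sum(r) for r in gray]
    cols = [sum(c) for c in zip(*gray)]
    return rows, cols


def indices_below(projection, threshold):
    """Indices whose projection value is below the threshold (dark bands)."""
    return [i for i, v in enumerate(projection) if v < threshold]


def pupil_distance(center_left, center_right, mm_per_pixel):
    """Distance between two pupil centers (x, y) in millimeters."""
    dx = center_right[0] - center_left[0]
    dy = center_right[1] - center_left[1]
    return math.hypot(dx, dy) * mm_per_pixel
```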
Abstract: Ensemble learning has been widely used for improving classification accuracy. Recent studies show that building ensemble classifiers through a multi-modal perturbation strategy can further improve classification performance. In this study, we propose an ensemble pruning algorithm based on approximate reducts and optimal sampling (EPA_AO). In EPA_AO, we design the multi-modal perturbation strategy to build different individual classifiers. The proposed perturbation strategy can simultaneously perturb the attribute space and training set, which can improve the diversity of individual classifiers. We use the evidential K-nearest neighbor (KNN) algorithm to train individual classifiers and compare EPA_AO with existing algorithms of the same type on multiple UCI data sets. Experimental results show that EPA_AO is an effective ensemble learning approach.
Abstract: On a construction site, safety helmets can reduce head injuries, and safety helmets of different colors represent different identities. The contemporary method of detecting safety helmet wearing and identifying types of work by video surveillance is time-consuming, incomplete, and low in supervision efficiency. In response, this study proposes an improved method of safety helmet wearing detection and identity recognition based on the you only look once version 4 (YOLOv4). On the basis of the original YOLOv4, the K-means algorithm is used to cluster the size of the prior box again, and multi-scale prediction output is added. The experimental distance intersection over union–non-maximum suppression (DIoU–NMS) is used for NMS so that safety helmet wearing detection and identity recognition of workers can achieve high efficiency and comprehensiveness. The results show that the average detection accuracy among workers wearing red, blue, yellow, and white safety helmets and workers without safety helmets is 92.1%, which means the proposed method ensures the real-time monitoring of the safety helmet wearing of workers on the construction site.
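As a minimal sketch of DIoU-based non-maximum suppression, the Python code below uses the commonly published DIoU definition: IoU minus the squared center distance divided by the squared diagonal of the smallest enclosing box. Boxes are assumed to be `(x1, y1, x2, y2)` tuples, and the function names and the 0.5 default threshold are illustrative rather than taken from the paper.

```python
def diou(a, b):
    """Distance-IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    iou = inter / union if union > 0 else 0.0
    # Squared distance between the two box centers.
    d2 = (((a[0] + a[2]) - (b[0] + b[2])) / 2) ** 2 \
        + (((a[1] + a[3]) - (b[1] + b[3])) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    c2 = cw ** 2 + ch ** 2
    return iou - (d2 / c2 if c2 > 0 else 0.0)


def diou_nms(boxes, scores, threshold=0.5):
    """Keep the best-scoring box, drop boxes whose DIoU with it exceeds threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if diou(boxes[best], boxes[i]) <= threshold]
    return keep
```

Because the center-distance penalty lowers the score of boxes whose centers are far apart, two overlapping boxes that cover different nearby workers are less likely to be merged than under plain IoU-based NMS.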
Abstract: To improve the robustness of the digital watermarking algorithm, this study proposes a zero-watermarking algorithm for three-dimensional (3D) mesh based on the vector length ratio. First, the 3D mesh model is converted from a rectangular coordinate system to a spherical coordinate system. Second, we divide the vertices of the mesh model into blocks and calculate the centroid of each block. Then, we calculate the ratio of the distance between each vertex and the centroid of the block to the average distance between its neighbor vertex and the centroid and compare the ratio sequence with the average of the ratios. We record the comparison results as 0 or 1 according to the size relation and perform statistics of the majority of the numbers of 0 and 1 in each block. In this way, the characteristic sequence of the model can be constituted. Finally, the binary sequence of the copyright information and the characteristic sequence are XORed to obtain the final zero-watermark sequence. Experimental results show that the algorithm in this study can resist common attacks such as translation, rotation, uniform scaling, noise, smoothing, simplification, and cropping, and can protect the copyright of the 3D mesh model.
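The last two steps, majority voting within each block and the XOR with the copyright bits, can be sketched in Python as follows. The sketch starts from precomputed per-block ratio lists and makes several assumptions that the abstract leaves open: the average is taken within each block, a ratio above the average is recorded as 1, and ties in the vote resolve to 0. All function names are hypothetical.

```python
def block_feature_bit(ratios):
    """Majority vote of 'ratio above the block average' over one block."""
    mean = sum(ratios) / len(ratios)
    ones = sum(1 for r in ratios if r > mean)
    return 1 if ones * 2 > len(ratios) else 0  # assumed: ties give 0


def characteristic_sequence(blocks):
    """One feature bit per block of vertex ratios."""
    return [block_feature_bit(b) for b in blocks]


def zero_watermark(blocks, copyright_bits):
    """XOR the characteristic sequence with the copyright bits."""
    feature = characteristic_sequence(blocks)
    if len(feature) != len(copyright_bits):
        raise ValueError("need one copyright bit per block")
    return [f ^ c for f, c in zip(feature, copyright_bits)]


def recover_copyright(blocks, watermark):
    """XOR is its own inverse, so the same operation recovers the bits."""
    return zero_watermark(blocks, watermark)
```

Nothing is embedded in the mesh itself: the watermark is stored separately, and verification recomputes the characteristic sequence from the (possibly attacked) model and XORs it with the stored sequence.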
Abstract: In recent years, artificial intelligence has been widely used in various fields. To address time-consuming manual weighing and complicated pricing procedures in supermarkets and vegetable markets, this study proposes an automatic fruit recognition model based on attention YOLOv5. First, to improve the recognition accuracy of fruits with different local features but similar global features, the study adds squeeze-and-excitation networks (SENet) after the spatial pyramid pooling (SPP) layer of YOLOv5 and uses the attention mechanism to automatically learn the importance of each feature channel. Further, the useful features for fruit recognition tasks according to the importance are strengthened and those useless are suppressed. Second, when the fruit recognition prediction frame overlaps the target frame, GIOU cannot accurately express the overlapping relationship of the frames. In response, this study replaces the original frame regression loss function GIOU with CIOU and considers the relationships of aspect ratio and center point between the target frame and the prediction frame. In this way, the fruit prediction frame is closer to the real frame, and thereby the prediction accuracy is improved. Experimental results show that the improved model has significantly improved fruit recognition ability in common scenarios with a mean average precision (mAP) of 99.10% and a recognition speed of 82 FPS, which can meet the needs of practical applications.
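The difference between GIoU and CIoU can be seen in a small Python sketch using the commonly published definitions of both measures; the corresponding losses are 1 minus each value. Boxes are assumed to be `(x1, y1, x2, y2)` tuples and the function names are illustrative. When the predicted box lies entirely inside the target box, GIoU reduces to plain IoU and cannot tell a centered prediction from an off-center one, while the center-distance term in CIoU can.

```python
import math


def _iou_parts(a, b):
    """Return (iou, union, enclosing width, enclosing height) for two boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    return (inter / union if union > 0 else 0.0), union, cw, ch


def giou(pred, target):
    """Generalized IoU: IoU minus the empty fraction of the enclosing box."""
    iou, union, cw, ch = _iou_parts(pred, target)
    c_area = cw * ch
    return iou - ((c_area - union) / c_area if c_area > 0 else 0.0)


def ciou(pred, target):
    """Complete IoU: IoU minus center-distance and aspect-ratio penalties."""
    iou, _, cw, ch = _iou_parts(pred, target)
    d2 = (((pred[0] + pred[2]) - (target[0] + target[2])) / 2) ** 2 \
        + (((pred[1] + pred[3]) - (target[1] + target[3])) / 2) ** 2
    c2 = cw ** 2 + ch ** 2
    # Aspect-ratio consistency term; atan2 avoids division by a zero height.
    v = (4 / math.pi ** 2) * (
        math.atan2(target[2] - target[0], target[3] - target[1])
        - math.atan2(pred[2] - pred[0], pred[3] - pred[1])) ** 2
    alpha = v / (1 - iou + v) if v > 0 else 0.0
    return iou - (d2 / c2 if c2 > 0 else 0.0) - alpha * v
```

For a 10×10 target at the origin, a 4×4 prediction placed at the center and one placed in a corner receive the same GIoU, whereas CIoU ranks the centered prediction higher.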
Abstract: There is a large amount of failure information from the engine after-sales maintenance and failure reports. This study introduces knowledge graphs and designs a systematic building procedure for the field of engine fault. It carries out ontology modeling for the multi-source fault data. The entity recognition framework that combines BERT with BiLSTM-CRF is used to mine expert knowledge in fault data. The index FF-IEF (fault frequency-inverse event frequency) is proposed, and fault diagnosis is performed based on the knowledge graph and Bayesian network. We design and develop the prototype system EFKG that contains 12534 entities and 408972 triplets. The system provides knowledge extraction, visual retrieval, and auxiliary decision-making. It can effectively improve the efficiency of information retrieval and maintenance and is of guiding significance for the application of knowledge graphs in the field of engine fault.
Abstract: In recent years, the application of scene graphs in image captioning has been increasingly researched. However, the current image captioning models based on scene graphs cannot take into account the previous input retained in long short-term memory (LSTM) networks, which may lead to missed information. In this study, we firstly propose the image captioning network based on original information injection, which keeps the original input information as much as possible and reduces the missed information. Secondly, we consider that the degree of the current graph updating mechanism is too large, which may lead to the missing of node information. Thus, we propose a visit control module to update the weights of visited nodes, avoiding such missing. Finally, we design a graph update factor (GUF) to determine the update level. We conduct experiments on the official dataset: MSCOCO. The mechanism evaluation shows that our model has achieved more competitive results compared with the baseline model.
Abstract: Given the low recognition rate and the difficulty in detecting small cracks in asphalt pavement under complex backgrounds, a crack detection method based on an improved Faster-RCNN is proposed. First, road surface images are collected by a multifunctional road detection vehicle, and 13,000 pictures are divided into training and test sets at a ratio of 8:2. Then the VGG16, MobileNet-V2, and ResNet50 networks are each used to replace the feature extraction network in the Faster-RCNN model to identify the cracks. The results show that the combination of ResNet50 and Faster-RCNN achieves the best result with an accuracy of 0.8058. The cracks are distributed on the same level without hierarchical information. Therefore, other ResNet networks are expected to work better with the Faster-RCNN model. However, it turns out that ResNet50 still outperforms ResNet18 and ResNet101. To address the missed detection of small cracks, the convolutional block attention module (CBAM) is also introduced into ResNet50, and the influence of different insertion positions on the detection accuracy is compared. Experiments show that the improved Faster-RCNN model has a detection accuracy of 85.64% and can effectively detect small cracks under complex backgrounds.
Abstract: To address the low signal-to-noise ratio of infrared images and their vulnerability to the impact of the background environment, this study proposes an image enhancement algorithm based on non-separable wavelet based multiscale directional analysis (NSWMDA) and linking synaptic computation network (LSCN). Firstly, the original image is subjected to non-separable wavelet transform (NSWT) with redundant lifting, which yields a high-frequency detail subband and a low-frequency approximation subband. Then, the high-frequency detail subband is filtered in multi-direction before its enhancement by the LSCN algorithm, while the low-frequency approximation subband is directly enhanced by the LSCN algorithm. Finally, the processed sub-images are fused and reconstructed to constitute the enhanced infrared image. In the infrared image of a power transformer, the edge strength, information entropy, peak signal-to-noise ratio, structural similarity and average gradient of the proposed algorithm are at least 10.86%, 14.39%, 19.95%, 7.06% and 6.70% higher than those of other algorithms. The experimental results show that the algorithm not only improves the overall clarity of the infrared image but also strengthens the detail texture and contrast of the image. It has a good infrared image enhancement effect and thus a bright application prospect for power equipment detection.
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-3