• Volume 31, Issue 7, 2022 Table of Contents
    • Survey
    • Review on Small-scale Pedestrian Detection Technology for Complex Pavement

      2022, 31(7):1-11. DOI: 10.15888/j.cnki.csa.008545

      Abstract:Pedestrian detection is an important research direction for intelligent transportation and intelligent vehicles and an important guarantee of road safety, since it directly affects how a vehicle control system judges road conditions. In practical application scenarios, small-scale pedestrian instances account for a very high proportion, yet small-scale pedestrian detection has always been a challenging problem in pedestrian detection tasks. When an intelligent vehicle is in a complex traffic environment, the precise detection of small-scale pedestrians lets the control system give a warning in advance and helps avoid collisions in time, which plays an important role in ensuring the safe and stable driving of the vehicle. With the rapid development of deep learning, groundbreaking progress has been made in small-scale pedestrian detection. To further promote its development, this study comprehensively surveys the latest small-scale pedestrian detection methods. To start with, this study analyzes several challenges faced by small-scale pedestrian detection and classifies and summarizes the latest small-scale pedestrian detection networks. The existing deep learning methods are analyzed and discussed from five aspects, namely multi-scale representation, context information, new training and classification strategies, scale perception, and super-resolution. Among them, multi-scale learning is the mainstream approach to small-scale pedestrian detection. Meanwhile, we briefly introduce the commonly used evaluation indicators and datasets for pedestrian detection and evaluate the performance of some mainstream methods on general datasets such as Caltech. In addition, five methods are summarized and compared in this study. Finally, this study identifies the urgent problems to be solved in pedestrian detection technology and the directions and tasks of future development from multiple aspects.

    • Overview on Speech Synthesis, Forgery and Detection Technology

      2022, 31(7):12-22. DOI: 10.15888/j.cnki.csa.008641

      Abstract:In recent years, with the rise of mobile intelligent devices, people contact and use voice information more and more frequently. Voice forgery and its detection have become increasingly important technologies in the field of voice processing. Firstly, this study clarifies the general process of a voice generation system and systematically summarizes the two main technologies, text-to-speech (TTS) and voice conversion (VC), in the field of voice forgery. Then, the common algorithms in voice forgery detection technology are introduced and classified. Finally, to tackle the existing problems in voice forgery and its detection, this study puts forward possible development directions from the perspectives of data, models, training methods and application scenarios.

    • Survey on Single Image Super-resolution Reconstruction Based on Deep Learning

      2022, 31(7):23-34. DOI: 10.15888/j.cnki.csa.008566

      Abstract:Image super-resolution reconstruction is an important technique to improve image quality. Thanks to the successful application and rapid development of deep learning in the field of computer vision, significant improvement in single image super-resolution (SISR) reconstruction has been achieved. In response, this study explores SISR reconstruction methods based on deep learning in depth. Relevant background knowledge such as benchmark data sets, performance evaluation indexes, and the loss function used in this field are outlined. Then, the latest algorithms for SISR reconstruction techniques with supervised and unsupervised learning are discussed respectively, and the differences and similarities among different models as well as their advantages and disadvantages are compared. Finally, the existing problems in this field are summarized, and future trends are proposed.

    • Survey on Deep Neural Network Image Target Detection Algorithms

      2022, 31(7):35-45. DOI: 10.15888/j.cnki.csa.008595

      Abstract:Thanks to the excellent feature extraction capabilities of deep convolutional neural networks, target detection has made great strides, and target detection combined with deep learning has achieved remarkable results. It has been widely used in real scenarios such as automatic driving, intelligent transportation systems, drone scenarios, military target detection, and medical navigation. This study reviews the shortcomings of traditional target detection algorithms and introduces commonly used detection data sets and performance evaluation indicators. It also summarizes classic target detection algorithms based on deep learning and elaborates on the current state of target detection and its existing difficulties and challenges. Finally, feasible directions for future research are outlined.

    • Review on Information Extraction Techniques for Knowledge Graph

      2022, 31(7):46-54. DOI: 10.15888/j.cnki.csa.008590

      Abstract:How to extract useful information from surging data has become a critical issue confronting artificial intelligence in the Internet age. As an important method, knowledge graph has become the main driving force to promote the development of artificial intelligence technology. Information extraction realizes the extraction of structured entities and their relationships from massive data, which is the primary step in constructing a knowledge graph. This study discusses the development trend of information extraction in knowledge graphs, as well as entity extraction, relationship extraction, event extraction, and key technologies. Finally, it analyzes and discusses the current problems, challenges, and future development.

    • Hybrid Model for Time Series Prediction Based on Autoencoder

      2022, 31(7):55-65. DOI: 10.15888/j.cnki.csa.008581

      Abstract:Time series prediction currently has a wide range of applications in many fields. It can help people make important decisions if they can accurately estimate the future development of events or indicators. However, modeling and accurately predicting time series with different features has become one of the most challenging applications. Therefore, a novel hybrid multi-step prediction model is proposed, called SSA-ConvBiAE. Firstly, the original data is decomposed into different trend components by singular spectrum analysis (SSA). Secondly, we design a new autoencoder network structure based on convolutional long short-term memory (ConvLSTM) and bidirectional gated recurrent unit (BiGRU). Finally, the different components are inputted to the corresponding autoencoders for training and prediction, and the prediction results are fused. To evaluate the predictive performance of our model, we conduct experiments on two real water supply datasets and two publicly available time series datasets. Experimental results show that the proposed model achieves better performance than baseline methods. The source code has been published on https://github.com/VIMLab-hfut/SSA-ConvBiAE.

    • Construction and Application of Knowledge Graph in Diesel Engine Fault Field

      2022, 31(7):66-76. DOI: 10.15888/j.cnki.csa.008592

      Abstract:There is a large amount of failure information from the engine after-sales maintenance and failure reports. This study introduces knowledge graphs and designs a systematic building procedure for the field of engine fault. It carries out ontology modeling for the multi-source fault data. The entity recognition framework that combines BERT with BiLSTM-CRF is used to mine expert knowledge in fault data. The index FF-IEF (fault frequency-inverse event frequency) is proposed, and fault diagnosis is performed based on the knowledge graph and Bayesian network. We design and develop the prototype system EFKG that contains 12534 entities and 408972 triplets. The system provides knowledge extraction, visual retrieval, and auxiliary decision-making. It can effectively improve the efficiency of information retrieval and maintenance and is of guiding significance for the application of knowledge graphs in the field of engine fault.

    • Multi-cluster Resource Monitoring Scheme for Kubernetes

      2022, 31(7):77-84. DOI: 10.15888/j.cnki.csa.008565

      Abstract:The increasing complexity of application systems and their shift to microservices promote the widespread use of containers. Enterprises often build multiple Kubernetes clusters for container orchestration, management, and resource allocation according to their business needs. To monitor the working status and resource usage of multiple clusters in real time, this study proposes a multi-cluster resource monitoring scheme for Kubernetes. The CPU, memory, network, and storage indicators provided by Kubernetes are collected, and part of the collected data is processed according to its data type to obtain more intuitive monitoring indicators. Multi-level and multi-type storage is achieved, and REST interfaces for monitoring data are provided. Experiments verify that this design consumes a small amount of cluster resources and achieves good performance.
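
The idea of deriving more intuitive indicators from raw metrics according to their data type can be sketched as follows. This is a minimal illustration, not the scheme from the paper: the `Sample` type and the function names are hypothetical, and it assumes that cumulative counters (such as CPU seconds consumed) are turned into rates while gauges (such as memory bytes in use) are turned into percentages of capacity.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One raw metric reading (hypothetical type for this sketch)."""
    timestamp: float  # seconds
    value: float      # raw metric value


def counter_rate(prev: Sample, curr: Sample) -> float:
    """Per-second rate of a cumulative counter between two samples."""
    dt = curr.timestamp - prev.timestamp
    if dt <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = curr.value - prev.value
    if delta < 0:
        # Counter reset (e.g. a container restart): assume it restarted from zero.
        delta = curr.value
    return delta / dt


def cpu_percent(prev: Sample, curr: Sample, cores: float) -> float:
    """CPU utilisation in percent, from cumulative CPU-seconds and a core count."""
    return 100.0 * counter_rate(prev, curr) / cores


def gauge_percent(used: float, capacity: float) -> float:
    """Utilisation in percent for a gauge such as memory bytes in use."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return 100.0 * used / capacity
```

For example, 5 CPU-seconds consumed over a 10-second window on a 2-core limit gives `cpu_percent(Sample(0, 10), Sample(10, 15), cores=2)`, that is, 25% utilisation.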

    • Service-based Application Monitoring and Analysis System of Cloud Platform

      2022, 31(7):85-92. DOI: 10.15888/j.cnki.csa.008555

      Abstract:To address the problem that the methods of cloud platforms to monitor application performance have a poor ability to collect and analyze anomalies in the whole process, this study proposes an application anomaly detection and bottleneck identification system based on cloud platform service components (AAD-PSC) that can provide monitoring and analysis characterized by customizable indicator values of applications on a cloud platform with multi-tier architecture. For this purpose, this system collects service invocation data at the front-end application service layer and correlates them with anomaly events. Then, customized anomaly detection methods are determined for the applications to achieve the optimal detection results. Finally, performance anomalies caused by non-workload changes are identified, and bottleneck identification is conducted. Experimental results show that the proposed monitoring system is able to quickly and accurately detect different types of anomaly events and identify corresponding performance bottlenecks and meets the needs of a cloud platform in application performance monitoring.

    • Intelligent Flight Control Development System Based on Reinforcement Learning

      2022, 31(7):93-98. DOI: 10.15888/j.cnki.csa.008591

      Abstract:The design and development of unmanned aerial vehicle (UAV) controllers are complex system engineering. The traditional development method based on code programming has the disadvantages of difficult development, long cycle, and high error rate. Although the intelligent flight control algorithm based on reinforcement learning has achieved good performance in simulation, it still lacks a complete development system in practice. This study presents a model-based development system for intelligent flight control, applying the reinforcement learning algorithm to the embedded development and deployment for flight control with modular programming and automatic code generation technologies. The system is equipped for the training simulation, testing, and hardware deployment of the reinforcement learning algorithm, and it is expected to improve the deployment speed of intelligent control algorithms represented by reinforcement learning and to reduce the development difficulty of intelligent flight control systems.

    • Data Exchange and Sharing Application for Overhaul Planning System Based on Dcloud

      2022, 31(7):99-105. DOI: 10.15888/j.cnki.csa.008579

      Abstract:With the acceleration of constructing ultra-high voltage (UHV) power grids, the integrated operation characteristics of power grids become increasingly distinct. As a core link connecting several systems, the overhaul planning system witnesses greatly increased tasks. Accordingly, power grid operation requires the overhaul planning system to update its data exchange service and data sharing service. This study determines the Dcloud-based multi-level data sharing mechanism of the overhaul planning system. Further, it designs and integrates data exchange service, data sharing service, message pushing service, and data encryption and management service, forming a multi-level data sharing application. The practical application shows that the multi-level data sharing mode enhances the data interaction efficiency of overhaul plans, ensures data security, and improves data sharing.

    • Image Captioning Based on Visiting Control Module and Original Information Injection

      2022, 31(7):106-112. DOI: 10.15888/j.cnki.csa.008593

      Abstract:In recent years, the application of scene graphs in image captioning has been increasingly researched. However, current image captioning models based on scene graphs cannot take into account the previous input retained in long short-term memory (LSTM) networks, which may lead to missed information. In this study, we first propose an image captioning network based on original information injection, which keeps as much of the original input information as possible and reduces the missed information. Second, we consider that the current graph updating mechanism updates too aggressively, which may lead to the loss of node information. Thus, we propose a visit control module to update the weights of visited nodes and avoid such loss. Finally, we design a graph update factor (GUF) to determine the update level. We conduct experiments on the MSCOCO dataset. The evaluation shows that our model achieves more competitive results than the baseline model.

    • Inexact Exception Handling in Shenwei 1621 Math Library

      2022, 31(7):113-119. DOI: 10.15888/j.cnki.csa.008567

      Abstract:Since the functions and interfaces of the domestic Shenwei basic math library need to be consistent with the libm library of the stand-alone compiler glibc, the basic math library is integrated into glibc for functional tests. Nevertheless, it is detected that some functions have inexact exception (INE) that needs to be eliminated. In response, this study investigates the exception detection mechanism of the glibc math library, analyzes and optimizes the INE of the numerical functions in the basic math library, and proposes a segmentation processing method for test data sets. In this way, such INE is eliminated. Tests show that the segmentation processing method for test data sets can effectively solve the INE of the numerical functions. Compared with the previous exception processing method, the proposed method improves the average performance speedup to 148%.

    • Turbo Boost Method on Loongson Platform Based on System Management Controller

      2022, 31(7):120-127. DOI: 10.15888/j.cnki.csa.008597

      Abstract:The voltage adjustment of the Loongson processor will affect the normal operation of CPU load. It causes the processors to fail to run under high voltage and then affects the stable high-frequency operation. Accordingly, the turbo boost method based on the system management controller (SMC) is proposed, which makes full use of the SMC to monitor and adjust the voltage and frequency of processor cores in real time. At the same time, for the real-time operation of SMC as well as the fast communication between SMC and the processor cores, the real-time operating system RT-Thread is equipped and the inter-processor communication protocol “Service Request” is designed. Through dynamic frequency and voltage regulation modules as well as the automatic temperature control module, the unified management of the frequency, voltage, and temperature of processor cores is realized. The experimental data on the Loongson 3A4000 platform shows that this method is effective and reliable. Moreover, when the overall power consumption of processors is increased by 25.5%, the overall performance can be improved by 34.2%.

    • Facial Expression Recognition Based on Mixer Layer

      2022, 31(7):128-134. DOI: 10.15888/j.cnki.csa.008554

      Abstract:At present, most facial expression recognition research uses a convolutional neural network (CNN) to extract facial features and classify them. The disadvantage of a CNN is that its network structure is complex and consumes substantial computing resources. In response, this study uses the Mixer Layer network structure based on multilayer perceptron (MLP) for facial expression recognition. Data augmentation and transfer learning methods are employed to solve the problem of insufficient data set samples, and Mixer Layer networks with different numbers of layers are built. According to experimental comparison, the recognition accuracy of the 4-layer Mixer Layer network on the CK+ and JAFFE data sets reaches 98.71% and 95.93% respectively, and that of the 8-layer Mixer Layer network on the Fer2013 data set is 63.06%. The experimental results show that the Mixer Layer networks without a convolution structure exhibit sound learning and generalization abilities in facial expression recognition tasks.

    • Intelligent Management System of Classroom Based on LabVIEW and Depth Vision Sensor

      2022, 31(7):135-142. DOI: 10.15888/j.cnki.csa.008586

      Abstract:To solve the problems that the existing classroom process management method is backward and that the functions of the current classroom management system are limited, this study designs an intelligent management system that integrates classroom attendance checking, classroom behavior recognition and management, and autonomous learning management. The system uses the depth vision sensor Kinect V2 as its data acquisition device and LabVIEW as its software development platform. It fulfills a classroom attendance checking function based on the combination of face recognition and mobile device positioning, a classroom behavior recognition and analysis function based on skeleton feature extraction and the support vector machine (SVM) classifier, and a student autonomous learning management function based on MyEclipse and MySQL database. The experimental results show that the recognition accuracy of the system for face recognition sign-in reaches 97% and its accuracy of classroom behavior recognition is more than 95%. The database design is reasonable and well established, and the autonomous learning function is flexible and reliable.

    • Fast Upload of Large Files Based on Adaptive Slicing

      2022, 31(7):143-148. DOI: 10.15888/j.cnki.csa.008583

      Abstract:As Internet technology continues to evolve, the demand for file uploads via the Web is on the rise. Nevertheless, large file uploads often run into bandwidth constraints, browser crashes, or loading timeouts due to excessive resource use, which greatly degrades user experience. In view of these limitations, this study designs and implements a large file upload system based on Node.js that combines adaptive slicing with concurrent upload to effectively shorten the upload time of such files. It also integrates the element-ui framework to show the upload progress in real time with a progress bar, which gives the system good interactivity.
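
The combination of adaptive slicing and concurrent upload can be illustrated with a short sketch. The abstract does not state the slicing rule, so the rule below is an assumption: the slice size is chosen so that one slice takes roughly `target_seconds` at the measured throughput, clamped to a minimum and maximum. All function names, default values, and the `upload_chunk` callback are hypothetical, and the sketch is in Python rather than the Node.js used by the system.

```python
from concurrent.futures import ThreadPoolExecutor


def choose_chunk_size(throughput_bps, target_seconds=2.0,
                      min_size=256 * 1024, max_size=8 * 1024 * 1024):
    """Pick a slice size (bytes) from measured throughput (bytes/second).

    Assumed rule: one slice should take about target_seconds to send,
    but never be smaller than min_size or larger than max_size.
    """
    size = int(throughput_bps * target_seconds)
    return max(min_size, min(max_size, size))


def slice_ranges(file_size, chunk_size):
    """Split [0, file_size) into consecutive (start, end) byte ranges."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [(start, min(start + chunk_size, file_size))
            for start in range(0, file_size, chunk_size)]


def upload_concurrently(ranges, upload_chunk, workers=4):
    """Upload all slices with a small thread pool; results keep slice order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(upload_chunk, ranges))
```

A caller would measure throughput on an initial slice, call `choose_chunk_size`, and pass a real network function as `upload_chunk`; the server then reassembles the slices by their byte ranges.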

    • Power Text Type Recognition Based on Multi-neural Network Cooperation

      2022, 31(7):149-157. DOI: 10.15888/j.cnki.csa.008598

      Abstract:To realize digital asset management, improve industry operation efficiency, and promote the integration of power informationization, power companies need to implement effective data organization and management methods. This study proposes an efficient text type recognition model based on character-level features for the data in the electric power industry. In this model, characters are put through the BERT pre-training model to generate dynamic and efficient character vectors of the power customer service text. A BiLSTM network with the attention mechanism is used for the input of character vector sequences. The attention mechanism enables the effective capture of the latent features helpful for type recognition. Finally, we use the Softmax layer to recognize the power text type. The model proposed in this study achieves an accuracy of 98.81% on a data set of power customer service text, which is better than traditional neural network methods such as CNN and BiLSTM. It enhances the application of the BERT model and effectively solves the problem of semantic long-distance dependence in power text type recognition.

    • Text Matching Model Incorporating Multi-angle Features

      2022, 31(7):158-164. DOI: 10.15888/j.cnki.csa.008544

      Abstract:Text matching is a core research area in natural language processing. Deep text matching models can be broadly classified into representational models and interactive models. The former tend to lose semantic focus and fail to measure the contextual importance of words. The latter lack global information such as sentence type and inter-sentence information. To address these problems, we propose a text matching model incorporating multi-angle features based on a Siamese neural network. The model generates word vectors using the BERT model and enhances semantic features by the similarity fusion of words. It then uses Bi-LSTM to encode the syntactic structure features, namely the syntactic structure information containing the lexical sequence. A Transformer encoder is utilized to realize the multi-level interaction between the syntactic structure features and the text features. Finally, the similarity is computed from the concatenated vectors. Experiments on a subset of the Quora question pair dataset show that this model performs better than classical deep matching models.

    • Zero-watermarking Algorithm for Three-dimensional Mesh Model Based on Vector Length Ratio

      2022, 31(7):165-171. DOI: 10.15888/j.cnki.csa.008575

      Abstract:To improve the robustness of the digital watermarking algorithm, this study proposes a zero-watermarking algorithm for three-dimensional (3D) mesh based on the vector length ratio. First, the 3D mesh model is converted from a rectangular coordinate system to a spherical coordinate system. Second, we divide the vertices of the mesh model into blocks and calculate the centroid of each block. Then, we calculate the ratio of the distance between each vertex and the centroid of the block to the average distance between its neighbor vertex and the centroid and compare the ratio sequence with the average of the ratios. We record the comparison results as 0 or 1 according to the size relation and perform statistics of the majority of the numbers of 0 and 1 in each block. In this way, the characteristic sequence of the model can be constituted. Finally, the binary sequence of the copyright information and the characteristic sequence are XORed to obtain the final zero-watermark sequence. Experimental results show that the algorithm in this study can resist common attacks such as translation, rotation, uniform scaling, noise, smoothing, simplification, and cropping, and can protect the copyright of the 3D mesh model.
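
The feature extraction and XOR steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the spherical-coordinate conversion is omitted, the block partition and the vertex adjacency are taken as given inputs, ties in the majority vote go to 1, and all function names are hypothetical.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def block_feature_bit(vertices, neighbors, block):
    """One feature bit for a block of vertex indices.

    For each vertex: ratio = (distance to block centroid) /
    (mean distance of its neighbour vertices to the same centroid).
    Each ratio is compared with the mean ratio (1 if >=, else 0), and
    the majority value over the block is returned (ties count as 1).
    """
    c = centroid([vertices[i] for i in block])
    ratios = []
    for i in block:
        nbr = [math.dist(vertices[j], c) for j in neighbors.get(i, [])]
        mean_nbr = sum(nbr) / len(nbr) if nbr else 0.0
        if mean_nbr == 0.0:
            continue  # ratio undefined for this vertex; skip it
        ratios.append(math.dist(vertices[i], c) / mean_nbr)
    if not ratios:
        return 0
    mean_ratio = sum(ratios) / len(ratios)
    ones = sum(1 for r in ratios if r >= mean_ratio)
    return 1 if 2 * ones >= len(ratios) else 0


def xor_bits(a, b):
    """Element-wise XOR of two equal-length bit sequences."""
    if len(a) != len(b):
        raise ValueError("bit sequences must have equal length")
    return [x ^ y for x, y in zip(a, b)]
```

Because XOR is its own inverse, `xor_bits(zero_watermark, features)` recovers the copyright bits at verification time without the model itself ever being modified, which is what makes the scheme a zero-watermark.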

    • Flower Classification Based on Improved VGG16 Network Model

      2022, 31(7):172-178. DOI: 10.15888/j.cnki.csa.008582

      Abstract:To further improve the accuracy of flower classification, this study proposes a network model based on visual attention mechanism after the research on the VGG16 network model. Squeeze-and-excitation (SE) attention is embedded in the VGG16 network model to extract salient region features of flowers. BN layer is added following the convolutional layer to effectively prevent gradient explosion and gradient disappearance and to speed up the training and convergence of the network. Multi-loss function fusion is utilized to train the new model. The new model can effectively extract salient regions such as stamens and petals. It can enlarge the distance between and within classes and accelerate the convergence, further improving the classification accuracy. Experimental results show that the new model advances in the classification accuracy of the Oxford-102 dataset after the introduction of the attention mechanism and outstrips the current reference.

    • GAN Speech Enhancement Algorithm with Multi-stage Generator and Time-frequency Discriminator

      2022, 31(7):179-185. DOI: 10.15888/j.cnki.csa.008587

      Abstract:The traditional speech enhancement generative adversarial network (SEGAN) takes the waveform of time-domain speech as the mapping target. When it comes to a low signal-to-noise ratio, the waveform of time-domain speech is drowned in the noise, resulting in a dramatic degradation of the enhancement performance of SEGAN and more serious speech distortion. In response, a multi-stage-time-frequency SEGAN (MS-TFSEGAN) is proposed for speech enhancement. MS-TFSEGAN employs multi-stage generators with dual time-frequency discriminators to continuously refine the mapping results. It captures both time- and frequency-domain information at the same time. In addition, for the further enhancement of learning ability in the frequency domain, MS-TFSEGAN introduces L1 loss in the generator loss function. Experimental results show that the speech quality and intelligibility of MS-TFSEGAN are improved by about 13.32% and 8.97%, respectively, compared with SEGAN under low SNR. A relative improvement of 7.3% in CER is achieved when MS-TFSEGAN is used as the front-end of speech recognition.

    • Extractive Multi-document Summarization Based on Multi-granularity Semantic Interaction

      2022, 31(7):186-193. DOI: 10.15888/j.cnki.csa.008562

      Abstract:Information explosion is a common problem in the information age. In order that valuable information can be extracted rapidly from massive text data, automatic summarization technologies have become a research priority in the field of natural language processing (NLP). The purpose of multi-document summarization is to refine important content from a group of documents on the same topic and thereby help users get key information quickly. To address the problems of incomplete information and high redundancy in multi-document summarizations, this study proposes an extractive summarization method based on multi-granularity semantic interaction that combines the multi-granularity semantic interaction network with maximal marginal relevance (MMR). Semantic interaction with different granularities is used to train sentence representation and key information with different granularities is captured to ensure the comprehensiveness of the summarization. In addition, modified MMR is employed to ensure the low redundancy of the summarization. The sentences in the input documents are scored by learning to rank, and summary sentences are then extracted. Experimental results on the Multi-News dataset show that the proposed extractive multi-document summarization model based on multi-granularity semantic interaction outperforms some baseline models such as LexRank and TextRank.
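
As a point of reference for the redundancy control mentioned above, the following sketch shows standard maximal marginal relevance selection, not the modified MMR used in the paper. It assumes relevance scores are already available (for instance from a ranking model) and uses bag-of-words cosine similarity as a stand-in for learned sentence representations; the function names and the default `lam` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def mmr_select(sentences, scores, k, lam=0.7):
    """Greedily pick k sentence indices by standard MMR.

    Each step picks the candidate maximising
    lam * relevance - (1 - lam) * (max similarity to already chosen sentences).
    """
    vecs = [Counter(s.lower().split()) for s in sentences]
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def mmr(i):
            redundancy = max((cosine(vecs[i], vecs[j]) for j in selected),
                             default=0.0)
            return lam * scores[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With `lam=1.0` the procedure reduces to ranking by relevance alone; lowering `lam` trades relevance for diversity, which is how near-duplicate sentences from different source documents are kept out of the summary.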

    • Chinese Speech Recognition Based on Conformer and N-gram

      2022, 31(7):194-202. DOI: 10.15888/j.cnki.csa.008638

      Abstract:The Transformer model can learn important information in the input sequence and achieves higher accuracy than traditional automatic speech recognition (ASR) models. The Conformer model adds a convolution module to the Transformer’s encoder, which increases the ability to capture subtle local information and further improves the performance of the model. In this study, the Conformer model and the N-gram language model (LM) are used in combination for Chinese speech recognition, and a good recognition effect is obtained. Experiments on the AISHELL-1 and aidatatang_200zh data sets show that the character error rate of the Conformer model can be reduced to 5.79% and 5.60%, respectively, which is 5.82% and 2.71% lower than that of the Transformer model. Combined with the N-gram LM, the character error rate can be further reduced to 4.86% and 5.10%, respectively, and the real-time factor (RTF) can reach 0.14566. When the test signal-to-noise ratio is reduced to 20 dB, the character error rate of the model is 8.58%, which indicates the anti-noise ability of the model.
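
The character error rate (CER) quoted above is conventionally computed as the character-level edit distance between the reference and the recognized text, divided by the reference length. The sketch below shows that common definition; it is not taken from the paper, and the function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance (substitutions, insertions, deletions) between two strings."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

For example, a six-character reference with one wrongly recognized character gives a CER of 1/6. Because insertions count as errors, CER can exceed 100% on very poor hypotheses.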

    • Clothes Detection Using Ghost Convolution and YOLOv5s Network

      2022, 31(7):203-209. DOI: 10.15888/j.cnki.csa.008621

      Abstract:To reduce the number of parameters and floating-point operations of the object detection model for clothes, we propose an improved lightweight object detection model for clothes, namely G-YOLOv5s. First, Ghost convolution is used to reconstruct the backbone network of YOLOv5s, and then the DeepFashion2 dataset is employed for model training and validation. Finally, the trained model is applied to the detection of clothes images. The experimental results show that the G-YOLOv5s algorithm achieves a mean average precision (mAP) of 71.7%, with a model size of 9.09 MB and 9.8 GFLOPs of floating-point operations. Compared with YOLOv5s, the model size of G-YOLOv5s is reduced by 34.8% and the floating-point operations by 41.3%, with an mAP drop of only 1.3%. Moreover, the model is convenient to deploy on devices with limited resources.
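
The quoted reductions can be sanity-checked with a little arithmetic. The sketch below backs out the baseline values implied by the percentages; these are derived estimates, not figures reported in the abstract, and the helper name is hypothetical.

```python
def baseline_from_reduction(reduced_value, reduction_pct):
    """Value before a relative reduction of reduction_pct percent."""
    return reduced_value / (1 - reduction_pct / 100)


# G-YOLOv5s figures quoted in the abstract
implied_size_mb = baseline_from_reduction(9.09, 34.8)  # about 13.9 MB
implied_gflops = baseline_from_reduction(9.8, 41.3)    # about 16.7 GFLOPs

print(f"implied YOLOv5s model size: {implied_size_mb:.2f} MB")
print(f"implied YOLOv5s compute: {implied_gflops:.2f} GFLOPs")
```

The abstract does not say whether the 1.3% mAP drop is absolute or relative, so the baseline mAP is deliberately not derived here.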

    • Ensemble Pruning Based on Approximate Reducts and Optimal Sampling

      2022, 31(7):210-216. DOI: 10.15888/j.cnki.csa.008605

      Abstract:Ensemble learning has been widely used for improving classification accuracy. Recent studies show that building ensemble classifiers through a multi-modal perturbation strategy can further improve classification performance. In this study, we propose an ensemble pruning algorithm based on approximate reducts and optimal sampling (EPA_AO). In EPA_AO, we design the multi-modal perturbation strategy to build different individual classifiers. The proposed perturbation strategy can simultaneously perturb the attribute space and training set, which can improve the diversity of individual classifiers. We use the evidential K-nearest neighbor (KNN) algorithm to train individual classifiers and compare EPA_AO with existing algorithms of the same type on multiple UCI data sets. Experimental results show that EPA_AO is an effective ensemble learning approach.

    • Image Super-resolution Reconstruction Based on Multi-scale Fusion CNN

      2022, 31(7):217-223. DOI: 10.15888/j.cnki.csa.008606

      Abstract (448) HTML (2057) PDF 1.74 M (1272) Comment (0) Favorites

      Abstract:To address the insufficient feature extraction of the FSRCNN model and its artificial redundant information caused by deconvolution, this study proposes an image super-resolution reconstruction algorithm based on a multi-scale fusion convolutional neural network. Specifically, a multi-scale fusion feature extraction channel is designed to cope with the insufficient utilization of image information of different sizes. Then, sub-pixel convolution is used for up-sampling in the image reconstruction part to suppress the artificial redundant information caused by deconvolution. Compared with the FSRCNN model, the algorithm in this study respectively increases the PSNR and SSIM by 0.14 dB and 0.001 0 on average at an amplification factor of 2 and by 0.48 dB and 0.009 1 on average at an amplification factor of 3 on Set5 and Set14 data sets. Experimental results show that the proposed algorithm can better retain the texture details of the image and improve the overall image reconstruction effect.
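
      The sub-pixel convolution mentioned above ends with a rearrangement step, often called pixel shuffle, that turns r×r groups of channels into an r-times larger image. The pure-Python sketch below shows only that rearrangement on nested lists; the channel ordering follows the common convention and is an assumption about the paper's implementation.

```python
def pixel_shuffle(x, r):
    """Rearrange C*r*r channels of size H x W into C channels of size (H*r) x (W*r).

    x is a list of channels, each a list of rows (nested lists).
    """
    channels = len(x)
    if channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels // (r * r)
    h, w = len(x[0]), len(x[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # row offset inside each r x r block
            for j in range(r):      # column offset inside each r x r block
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out


# Four 1x1 channels become one 2x2 image at an amplification factor of 2.
print(pixel_shuffle([[[1]], [[2]], [[3]], [[4]]], 2))
```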

    • X-ray Security Inspection for Contraband Detection Based on Improved Cascade RCNN Network

      2022, 31(7):224-230. DOI: 10.15888/j.cnki.csa.008624

      Abstract (502) HTML (1153) PDF 1.65 M (1159) Comment (0) Favorites

      Abstract:To address the low detection rate of contraband in X-ray security inspection, an algorithm based on an improved Cascade RCNN is proposed. A batch feature erasing (BFE) module is introduced into the network structure; by randomly erasing the same region, it enhances local feature learning and thus further strengthens the learned representation of residual features. In addition, a weighted SD loss function is presented to solve the problem of low detection rates. It fuses Smooth L1 loss and DIoU loss by weighting, and adjusting the proportion coefficient of the weights makes the detection results more accurate. The experimental results show that, on an open contraband dataset for X-ray security inspection, the detection rate of the improved Cascade RCNN is 3.11% higher than that of the original algorithm, and its accuracy is also improved.
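
      One plausible reading of the weight fusion is a convex combination of the two losses. The sketch below assumes boxes given as (x1, y1, x2, y2) and a single coefficient `alpha`; the function names, the exact weighting form, and the default value are assumptions, since the abstract does not give the formula.

```python
def smooth_l1(pred, target, beta=1.0):
    """Mean Smooth L1 loss over the four box coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)


def diou_loss(pred, target):
    """DIoU loss: 1 - IoU + (squared center distance / squared enclosing diagonal)."""
    iw = max(0.0, min(pred[2], target[2]) - max(pred[0], target[0]))
    ih = max(0.0, min(pred[3], target[3]) - max(pred[1], target[1]))
    inter = iw * ih
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_t = (target[2] - target[0]) * (target[3] - target[1])
    union = area_p + area_t - inter
    iou = inter / union if union > 0 else 0.0
    rho2 = (((pred[0] + pred[2]) - (target[0] + target[2])) ** 2
            + ((pred[1] + pred[3]) - (target[1] + target[3])) ** 2) / 4.0
    ew = max(pred[2], target[2]) - min(pred[0], target[0])
    eh = max(pred[3], target[3]) - min(pred[1], target[1])
    c2 = ew * ew + eh * eh
    return 1.0 - iou + (rho2 / c2 if c2 > 0 else 0.0)


def weighted_sd_loss(pred, target, alpha=0.5):
    """Assumed fusion: alpha * Smooth L1 + (1 - alpha) * DIoU."""
    return alpha * smooth_l1(pred, target) + (1.0 - alpha) * diou_loss(pred, target)


print(weighted_sd_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

      Raising `alpha` shifts the emphasis toward coordinate regression, while lowering it emphasizes overlap and center alignment.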

    • Video Magnification Based on Multi-scale Filtering

      2022, 31(7):231-238. DOI: 10.15888/j.cnki.csa.008608

      Abstract (508) HTML (816) PDF 1.83 M (2797) Comment (0) Favorites

      Abstract:The video magnification technology gives people an opportunity to observe and study small changes in things. A video is decomposed by the complex steerable pyramid and the motion in the video is manipulated by analyzing the phase differences at different scales and in different directions. Magnified signals at some scales in the complex steerable pyramid exceed their phase shift limits, resulting in artifacts and blurring. This problem can be solved by adjusting the magnification factor for each scale. In this study, a video magnification algorithm based on multi-scale filtering is proposed. A relation between the spatial wavelength of a video frame image and vibration displacement is established to determine the upper limit of the magnification factor for each scale. The manual setting of a cut-off wavelength is no longer necessary, and the amplification factor for each scale in the complex steerable pyramid is adjusted automatically so that magnified signals can adapt to their phase shift limits. The vibration attached to the large motion is magnified to verify the performance of the proposed algorithm, and the result shows that it has obvious advantages over the existing video amplification algorithms.

    • Fault Classification Based on K-means Bayes and AdaBoost-SVM

      2022, 31(7):239-246. DOI: 10.15888/j.cnki.csa.008585

      Abstract (509) HTML (508) PDF 1.56 M (891) Comment (0) Favorites

      Abstract:Traditional fault classification methods mostly assume similar or equal sample sizes for different types of data. However, the bulk of data collected in the actual industrial process is normal with a minority belonging to fault data, which causes data imbalance. Aiming at the imbalanced data, this study proposes the fault classification method combining K-means Bayes with AdaBoost-SVM. Two independent classifiers are designed with the D-S evidence theory to merge the classification results, so as to make up for their weak classification capabilities for certain categories. Experiments show that the fault classification method proposed in this study has higher classification accuracy than single Bayes or SVM.
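
      As a minimal illustration of merging two classifiers with D-S evidence theory, the sketch below applies Dempster's rule of combination under the simplifying assumption that each classifier assigns mass only to single classes (no compound hypotheses). The class labels and mass values are invented for the example.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions defined on singleton classes only.

    With singleton focal elements, agreeing mass is m1(c) * m2(c) and all
    other products count as conflict, which is normalized away.
    """
    classes = set(m1) | set(m2)
    agreement = {c: m1.get(c, 0.0) * m2.get(c, 0.0) for c in classes}
    total = sum(agreement.values())
    if total == 0:
        raise ValueError("total conflict: the two sources cannot be combined")
    return {c: v / total for c, v in agreement.items()}


# Hypothetical outputs of the two classifiers for one sample.
bayes_mass = {"normal": 0.6, "fault": 0.4}
svm_mass = {"normal": 0.3, "fault": 0.7}
print(dempster_combine(bayes_mass, svm_mass))
```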

    • Vehicle OD Trajectory Restoration Based on Automatic License Plate Recognition Data

      2022, 31(7):247-252. DOI: 10.15888/j.cnki.csa.008573

      Abstract (571) HTML (2004) PDF 1.42 M (1014) Comment (0) Favorites

      Abstract:The license plate data captured by a traffic information collection device is a natural carrier for studying the trajectory of a vehicle and is useful to track and restore the complete trajectory of the vehicle on the road network. However, due to the limitations of technology and device coverage, the collected time series data of license plates is inevitably sparse and incomplete. To make full use of license plate data, this study proposes an origin-destination (OD) trajectory restoration algorithm based on sparse license plate data. The algorithm first separates the OD trip chain of the vehicle by the interval time threshold. Then, it generates multiple approximate candidate trajectories with the K-shortest path (KSP) algorithm. Finally, a variational autoencoder (VAE) is used to select the optimal estimated trajectory for decision-making so that the complete trajectory of the vehicle can be obtained. This method has been implemented and verified in an actual transportation analysis zone in Xiaoshan District, Hangzhou City. The results show that the proposed method achieves a comprehensive accuracy of 95%. Additionally, the reconstruction method still has sound performance (higher than 50%) in the case of a high node loss rate and low camera coverage.
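
      The first step, separating a vehicle's records into OD trips by an interval time threshold, can be sketched as follows. The record layout (timestamp in seconds, camera ID) and the 1 800 s threshold are assumptions for illustration; the abstract does not state the threshold used.

```python
def split_trips(records, max_gap_s):
    """Split one vehicle's (timestamp_s, camera_id) records into trips.

    A new trip starts whenever the gap between consecutive detections
    exceeds max_gap_s.
    """
    trips = []
    current = []
    for rec in sorted(records, key=lambda r: r[0]):
        if current and rec[0] - current[-1][0] > max_gap_s:
            trips.append(current)
            current = []
        current.append(rec)
    if current:
        trips.append(current)
    return trips


records = [(0, "A"), (300, "B"), (4000, "C"), (4200, "D")]
for trip in split_trips(records, max_gap_s=1800):
    print(trip[0][1], "->", trip[-1][1], trip)
```

      The first and last cameras of each trip give its observed origin and destination; the candidate generation and VAE selection steps are not shown.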

    • Performance Evaluation Method for Front-line Staff in Electric Power Company Based on PSO Algorithm

      2022, 31(7):253-258. DOI: 10.15888/j.cnki.csa.008564

      Abstract (423) HTML (510) PDF 994.36 K (753) Comment (0) Favorites

      Abstract:Performance evaluation of the front-line staff in an electric power company often encounters problems such as hard assessment for the raters and undue quantification. In response, this study proposes a model of performance evaluation based on work tickets. Vertical quantification of counting is performed for the same work item through simple characterization of multi-dimensional evaluation attributes, and horizontal clustering is conducted for different work items according to the weights of the subjective assessment by team leaders. In this way, the model tries to explore the hidden information in the raters’ subjective assessment. Meanwhile, a dynamic random population topology-particle swarm optimization (RPT-PSO) algorithm based on the average degree is proposed to solve the model. In-depth research is conducted on the encoding modes of particles, the way that constraints are handled, and the specific implementation of strategies in the algorithm. Finally, five teams of the same type are selected for performance measurement by the proposed model. The computational results demonstrate that the proposed model and the RPT-PSO algorithm are effective, providing a new solution to the performance evaluation of the front-line staff in an electric power company.

    • Automatic Detection of Human Eye Pupil Distance Based on Gray Integral Projection and Hough Circle Transform Algorithm

      2022, 31(7):259-264. DOI: 10.15888/j.cnki.csa.008604

      Abstract (416) HTML (1312) PDF 1.24 M (7975) Comment (0) Favorites

      Abstract:The pupil distance of human eyes is a basic parameter which needs to be measured accurately in optometry research and glasses prescription. It is of great application value to realize automatic pupil distance detection. With the help of a Gaussian skin color model, this study proposes a method of pupil location and pupil distance calculation based on gray integral projection and Hough circle transform. Firstly, the image is subjected to the preprocessing of illumination compensation by the adaptive brightness correction method of a two-dimensional gamma function. On this basis, the face area is extracted by using a skin color probability model. Then, the gray integral projection method is used to extract the eye area by selecting an appropriate threshold. After morphological image processing, the Canny edge detection operator is utilized to detect the pupil edge. Finally, the pupil location is carried out by Hough circle transform, and the center distance is calculated. The results show that the algorithm has a fast pupil location speed and can realize the automatic pupil distance measurement more accurately, which lays the foundation for the research on intelligent technologies of pupil distance measurement.
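
      Gray integral projection sums gray values along rows and columns; dark structures such as eyes show up as low values in the projections. The toy sketch below uses an invented 3×5 image and hand-picked thresholds, and it replaces the Hough circle step with assumed pupil centers, so it only illustrates the projection and distance calculation.

```python
import math


def integral_projections(gray):
    """Return (row sums, column sums) of a grayscale image given as nested lists."""
    rows = [sum(r) for r in gray]
    cols = [sum(c) for c in zip(*gray)]
    return rows, cols


def below_threshold(projection, threshold):
    """Indices whose projection value falls below the threshold (dark bands)."""
    return [i for i, v in enumerate(projection) if v < threshold]


gray = [
    [200, 200, 200, 200, 200],
    [200, 40, 200, 40, 200],   # two dark "pupils"
    [200, 200, 200, 200, 200],
]
rows, cols = integral_projections(gray)
eye_rows = below_threshold(rows, 800)
eye_cols = below_threshold(cols, 500)
print(eye_rows, eye_cols)

# Assumed pupil centers (x, y) in pixels; a real pipeline would get them
# from the Hough circle transform.
left, right = (eye_cols[0], eye_rows[0]), (eye_cols[1], eye_rows[0])
print(math.dist(left, right))
```

      Converting the pixel distance to millimeters would require a calibration factor, which the abstract does not specify.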

    • LSCN Infrared Image Enhancement Method Based on Redundant Lifting Non-separable Wavelet Transforms

      2022, 31(7):265-271. DOI: 10.15888/j.cnki.csa.008578

      Abstract (637) HTML (633) PDF 1.23 M (19208) Comment (0) Favorites

      Abstract:To address the low signal-to-noise ratio of infrared images and their vulnerability to the impact of the background environment, this study proposes an image enhancement algorithm based on non-separable wavelet based multiscale directional analysis (NSWMDA) and linking synaptic computation network (LSCN). Firstly, the original image is subjected to non-separable wavelet transform (NSWT) with redundant lifting, which yields a high-frequency detail subband and a low-frequency approximation subband. Then, the high-frequency detail subband is filtered in multi-direction before its enhancement by the LSCN algorithm, while the low-frequency approximation subband is directly enhanced by the LSCN algorithm. Finally, the processed sub-images are fused and reconstructed to constitute the enhanced infrared image. In the infrared image of a power transformer, the edge strength, information entropy, peak signal-to-noise ratio, structural similarity and average gradient of the proposed algorithm are at least 10.86%, 14.39%, 19.95%, 7.06%, and 6.70% higher than those of other algorithms. The experimental results show that the algorithm not only improves the overall clarity of the infrared image but also strengthens the detail texture and contrast of the image. It has a good infrared image enhancement effect and thus a bright application prospect for power equipment detection.

    • Safety Helmet Wearing Detection and Type of Work Identification Based on YOLOv4

      2022, 31(7):272-277. DOI: 10.15888/j.cnki.csa.008561

      Abstract (675) HTML (1293) PDF 1.42 M (21683) Comment (0) Favorites

      Abstract:On a construction site, safety helmets can reduce head injuries, and safety helmets of different colors represent different identities. The current method of detecting safety helmet wearing and identifying types of work by video surveillance is time-consuming, incomplete, and low in supervision efficiency. In response, this study proposes an improved method of safety helmet wearing detection and identity recognition based on you only look once version 4 (YOLOv4). On the basis of the original YOLOv4, the K-means algorithm is used to re-cluster the sizes of the prior boxes, and a multi-scale prediction output is added. In the experiments, distance intersection over union–non-maximum suppression (DIoU–NMS) is used for NMS so that safety helmet wearing detection and identity recognition of workers can achieve high efficiency and comprehensiveness. The results show that the average detection accuracy among workers wearing red, blue, yellow, and white safety helmets and workers without safety helmets is 92.1%, which means the proposed method ensures the real-time monitoring of the safety helmet wearing of workers on the construction site.
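
      DIoU-NMS differs from standard NMS in that a box is suppressed only when its IoU minus a normalized center-distance penalty exceeds the threshold, so nearby but distinct targets are less likely to be merged. The sketch below is a generic pure-Python version with boxes as (x1, y1, x2, y2); the threshold of 0.5 and the sample boxes are assumptions, not values from the paper.

```python
def diou(a, b):
    """IoU minus squared center distance over squared enclosing-box diagonal."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union if union > 0 else 0.0
    rho2 = (((a[0] + a[2]) - (b[0] + b[2])) ** 2
            + ((a[1] + a[3]) - (b[1] + b[3])) ** 2) / 4.0
    ew = max(a[2], b[2]) - min(a[0], b[0])
    eh = max(a[3], b[3]) - min(a[1], b[1])
    c2 = ew * ew + eh * eh
    return iou - (rho2 / c2 if c2 > 0 else 0.0)


def diou_nms(boxes, scores, threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too strongly.
        order = [i for i in order if diou(boxes[best], boxes[i]) <= threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(diou_nms(boxes, scores))
```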

    • Segmentation of Sinter Pores Based on Improved UNet Network

      2022, 31(7):278-284. DOI: 10.15888/j.cnki.csa.008559

      Abstract (597) HTML (1687) PDF 2.07 M (920) Comment (0) Favorites

      Abstract:In sinter production, the pores formed in the sinter are an important parameter for sinter quality evaluation. Because sinter pores vary in shape and have fuzzy edges, pore segmentation error rates are high. To segment the pores more accurately, the sinter image is first preprocessed with OpenCV. Through a comparative analysis of traditional image segmentation algorithms, this study proposes an algorithm based on an improved UNet network to segment the preprocessed sinter pore image. An improved module integrating residual and splicing is introduced into the UNet encoder to obtain more information on pore features. The experimental results show that the improved algorithm is better than both the traditional UNet network and the traditional image segmentation algorithm in mean intersection over union (MIoU) and the Dice index.
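
      The two evaluation indexes named above can be computed from flattened binary masks as sketched below, with 1 denoting pore pixels and 0 denoting background. The helper names and the toy masks are illustrative only.

```python
def class_iou(pred, truth, cls):
    """IoU of one class, or None if the class is absent from both masks."""
    inter = sum(1 for p, t in zip(pred, truth) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, truth) if p == cls or t == cls)
    return inter / union if union else None


def mean_iou(pred, truth, classes=(0, 1)):
    """Mean IoU over the classes that actually occur."""
    ious = [v for v in (class_iou(pred, truth, c) for c in classes) if v is not None]
    if not ious:
        raise ValueError("no listed class occurs in the masks")
    return sum(ious) / len(ious)


def dice(pred, truth, cls=1):
    """Dice index: 2 * |A ∩ B| / (|A| + |B|) for the given class."""
    inter = sum(1 for p, t in zip(pred, truth) if p == cls and t == cls)
    size = sum(1 for p in pred if p == cls) + sum(1 for t in truth if t == cls)
    return 2 * inter / size if size else 1.0


pred = [1, 1, 0, 0]
truth = [1, 0, 0, 0]
print(mean_iou(pred, truth), dice(pred, truth))
```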

    • Image Denoising Using Graph-based Filtering

      2022, 31(7):285-289. DOI: 10.15888/j.cnki.csa.008596

      Abstract (437) HTML (862) PDF 1.37 M (812) Comment (0) Favorites

      Abstract:In the field of image processing, image denoising is quite challenging. The development of graph signal theory provides a new perspective for us to solve this problem. In this study, the weight matrix and the Laplace matrix based on a graph signal method are studied, and they are used for the objective function of image denoising. These two matrices can well define the internal relationship between the observed image and the expected image. After proposing the denoising objective function, we give the optimal solution and an iterative fast solution algorithm. Experiments show that this method is superior to cutting-edge denoising methods such as BM3D and WNNM.

    • Geological Text Topic Model Based on Data Augmentation

      2022, 31(7):290-297. DOI: 10.15888/j.cnki.csa.008563

      Abstract (502) HTML (595) PDF 1.54 M (789) Comment (0) Favorites

      Abstract:Problems such as low topic accuracy and poor continuity of topic keywords occur when geological texts are directly clustered by topic models. This study adopts relevant improvement methods. In the word segmentation stage, the repeated word string extraction algorithm based on word frequency statistics is adopted. Geological terms are retained to accurately extract text topics, and redundant word strings are reduced to save memory costs. In this way, the efficiency of retained word extraction is improved. In addition, a text data augmentation algorithm based on term frequency-inverse document frequency (TF-IDF) and word vector is used to process the original word segmentation corpus and thereby strengthen the text topic features. Then, the algorithm is combined with the topic model to extract the corpus topics on the processed corpus. The performance of the model is improved due to its enhanced prior information. The experimental results show that the method combining the proposed algorithm with the latent Dirichlet allocation (LDA) model performs well, superior to other methods in all the related indexes and output results.
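
      The TF-IDF part of the augmentation step can be sketched with the plain textbook weighting tf × log(N / df). The paper's exact variant, and how it combines TF-IDF with word vectors, is not given in the abstract, so the formula choice and the toy geological tokens below are assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in counts.items()
        })
    return weights


docs = [["granite", "fault", "fault"], ["granite", "basalt"]]
for w in tf_idf(docs):
    print(w)
```

      Terms that occur in every document, such as "granite" here, receive zero weight, whereas terms concentrated in one document score highest and are natural candidates for strengthening topic features.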

    • 3D Reconstruction for Small Motion Clips

      2022, 31(7):298-306. DOI: 10.15888/j.cnki.csa.008577

      Abstract (476) HTML (862) PDF 5.80 M (685) Comment (0) Favorites

      Abstract:When a user takes a photo, a small motion of image frames is usually induced by hand shaking. On the one hand, the small motion contains parallax information, which is valuable for scene depth perception and can be potentially used in many applications, such as VR/AR and photo refocusing. On the other hand, due to narrow baselines, corresponding point matching of images is sensitive to noise, as a result of which scene reconstruction from uncalibrated small motion clips is quite challenging. Existing state-of-the-art methods for 3D reconstruction from small motion clips are generally less accurate since they do not consider the uncertainties. In this study, we propose a high-accuracy method for 3D reconstruction from uncalibrated small motion clips. The proposed method consists of two key steps. Firstly, in the self-calibration stage, we propose a viewpoint-weighted bundle adjustment method that fully considers the matching uncertainties of different neighboring viewpoints due to different baselines and assigns smaller confidence to the viewpoints with narrower baselines, thereby keeping the robustness during self-calibration. Secondly, we present a TGV-based depth image estimation method that can alleviate noise caused by narrow baselines while maintaining slanted structures and detailed geometric features. The quantitative and qualitative experiments on public datasets and synthetic datasets clearly demonstrate the effectiveness of the proposed method in comparison with state-of-the-art methods.

    • User Interface Layout Optimization Incorporating Cognitive Load

      2022, 31(7):307-315. DOI: 10.15888/j.cnki.csa.008623

      Abstract (372) HTML (573) PDF 1.62 M (944) Comment (0) Favorites

      Abstract:A design method integrating cognitive loads is proposed to optimize the interfaces of project management systems and improve the usability of the interfaces. This method combines objective cognition laws with subjective cognition of users to form layout design principles and then transform them into a calculable multi-objective optimization function for interface layout. On this basis, the sparrow search algorithm (SSA) encodes the objective function to seek the optimal solution in space and obtain the layout optimization plan. The homepage of a railway project management platform is taken as an example to verify the effectiveness of the method in this study. Specifically, apart from the optimization of the proposed method, the physiological measurement method of eye movement is introduced into the evaluation process, and the usability test is carried out to measure eye movement loads. The experimental results show that the optimized interface layout has been significantly improved regarding the interface usability index, which proves the effectiveness and practicability of the method.

    • Application of Interactive Genetic Algorithm in Interior Layout Design

      2022, 31(7):316-324. DOI: 10.15888/j.cnki.csa.008589

      Abstract (495) HTML (729) PDF 2.67 M (673) Comment (0) Favorites

      Abstract:Interior graphic design is the first step in decorating a rough house delivered by a real estate developer, and floor plan design that meets the individual needs of users is a typical implicit optimization problem. It is difficult and extremely costly to accurately obtain user needs through communication between designers and users. Based on the theory of optimal rules, the human-computer interaction mode of the interactive genetic algorithm is used to replace the cumbersome fitness function design with subjective evaluation and thereby address these challenges. Experiments show that the algorithm can effectively meet the personalized needs of customers in the process of floor plan design. The improved interactive genetic algorithm introduces the subjective evaluation of users into the traditional design process so that users can truly participate in the design work.

    • Application of Stacking Ensemble Learning Model in Blended Performance Classification and Prediction

      2022, 31(7):325-332. DOI: 10.15888/j.cnki.csa.008572

      Abstract (439) HTML (1764) PDF 1.46 M (1105) Comment (0) Favorites

      Abstract:To tackle the problem that existing single-algorithm models have poor generalization ability in performance prediction, this study proposes a Stacking ensemble learning model based on multi-algorithm fusion for the prediction of students’ performance in blended teaching. The model uses multinomial naive Bayes, AdaBoost, and Gradient boosting as primary learners and logistic regression as the secondary learner to form a two-level fusion framework. The model is verified on the learning behavior data generated in the process of blended teaching. Experimental results show that the classification and prediction accuracy of the Stacking ensemble learning model on the test set reaches 76%, which is 5%, 6%, 9%, and 6% higher than that of the four single-algorithm models of multinomial naive Bayes, AdaBoost, Gradient boosting, and logistic regression, respectively. Compared with these single-algorithm models, the Stacking ensemble learning model has strong generalization ability, which can better predict students’ performance and provide a reference for the learning warning of blended teaching.

    • Automatic Fruit Recognition Based on Attention YOLOv5 Model

      2022, 31(7):333-340. DOI: 10.15888/j.cnki.csa.008576

      Abstract (1376) HTML (4473) PDF 2.30 M (7121) Comment (0) Favorites

      Abstract:In recent years, artificial intelligence has been widely used in various fields. To address time-consuming manual weighing and complicated pricing procedures in supermarkets and vegetable markets, this study proposes an automatic fruit recognition model based on attention YOLOv5. First, to improve the recognition accuracy of fruits with different local features but similar global features, the study adds squeeze-and-excitation networks (SENet) after the spatial pyramid pooling (SPP) layer of YOLOv5 and uses the attention mechanism to automatically learn the importance of each feature channel. Further, the useful features for fruit recognition tasks according to the importance are strengthened and those useless are suppressed. Second, when the fruit recognition prediction frame overlaps the target frame, GIOU cannot accurately express the overlapping relationship of the frames. In response, this study replaces the original frame regression loss function GIOU with CIOU and considers the relationships of aspect ratio and center point between the target frame and the prediction frame. In this way, the fruit prediction frame is closer to the real frame, and thereby the prediction accuracy is improved. Experimental results show that the improved model has significantly improved fruit recognition ability in common scenarios with a mean average precision (mAP) of 99.10% and a recognition speed of 82 FPS, which can meet the needs of practical applications.

    • Crack Detection of Asphalt Pavement Based on Improved Faster-RCNN

      2022, 31(7):341-348. DOI: 10.15888/j.cnki.csa.008594

      Abstract (886) HTML (1899) PDF 1.97 M (24933) Comment (0) Favorites

      Abstract:Given the low recognition rate and the difficulty in detecting small cracks in asphalt pavement under complex backgrounds, a crack detection method based on improved Faster-RCNN is proposed. First, the road surface images are collected by a multifunctional road detection vehicle, and 13 000 pictures are divided into training and test sets at a ratio of 8:2. Then VGG16, MobileNet-V2, and ResNet50 networks are utilized to replace the feature extraction network in the Faster-RCNN model to identify the cracks. The results show that the combination of ResNet50 and Faster-RCNN achieves the best result with an accuracy of 0.805 8. The cracks are distributed on the same level without hierarchical information. Therefore, other ResNet networks are expected to work better with the Faster-RCNN model. However, it turns out that ResNet50 still outperforms ResNet18 and ResNet101. To address missed detections of small cracks, the convolutional block attention module (CBAM) is also introduced into ResNet50, and the influence of different insertion positions on the detection accuracy is compared. Experiments show that the improved Faster-RCNN model has a detection accuracy of 85.64% and can effectively detect small cracks under complex backgrounds.

    • Deep Feature Fusion Method and Its Application in Leaf Disease Recognition

      2022, 31(7):349-355. DOI: 10.15888/j.cnki.csa.008549

      Abstract (504) HTML (743) PDF 1.39 M (844) Comment (0) Favorites

      Abstract:Automatic recognition of crop leaf diseases is an important application of computer vision technology in agriculture. In recent years, deep learning methods have made some progress in the recognition of crop leaf diseases, and they are all based on deep feature representations of a single deep convolutional neural network (CNN) model. However, the useful fact that the image representation ability of different deep CNN models is complementary has not received attention for research. Thus, this study proposes a network model MDFF-Net for fusing different deep features. MDFF-Net connects two pre-trained deep CNN models in parallel and then sets a fully connected layer with the same number of neurons for each model to transform the deep features output by different models into features with the same dimension. Then, through the non-linear transform of two fully connected layers, the effect of feature fusion is further improved. We choose VGG-16 and ResNet-50 as the feature extractors of MDFF-Net and conduct experiments on a public dataset containing five apple leaf diseases. The experimental results show that the recognition accuracy of MDFF-Net is 96.59%, which is better than the results achieved by VGG-16 or ResNet-50 alone and thus proves the effectiveness of the deep feature fusion method.

    • Cloud-edge Collaboration Technology for Power Grid PMU Time Series Data

      2022, 31(7):356-364. DOI: 10.15888/j.cnki.csa.008558

      Abstract (465) HTML (1264) PDF 1.80 M (1124) Comment (0) Favorites

      Abstract:The phasor measurement unit (PMU) time series data is effective in supporting the safe and stable operation of the power grid. Nevertheless, as the power grid dispatching and control cloud is constructed, services such as statistical analysis of data on power grid operation request the sharing of PMU time series data on the cloud. This study proposes a cloud-edge collaboration technology for such data. The technology builds an architecture of a data collaboration model according to the storage and access characteristics of PMU time series data so that data of this type are collected from the nodes at the edge of the production control area in the power grid dispatching system to the dispatching and control cloud and shared in a model manner. The feasibility of the proposed technology is verified through implementation and testing in a test environment.

    • Prediction Method of PCC-BiLSTM-GRU-Attention Combined Model

      2022, 31(7):365-371. DOI: 10.15888/j.cnki.csa.008580

      Abstract (574) HTML (967) PDF 1.27 M (957) Comment (0) Favorites

      Abstract:When dealing with multivariate time series, traditional prediction models often struggle to capture the complex variation of nonlinear dynamic systems, which results in low prediction accuracy. To solve this problem, this study discusses and verifies the prediction method of the PCC-BiLSTM-GRU-Attention combined model. In the method, the Pearson correlation coefficient (PCC) is first used for correlation tests, and irrelevant features are deleted to achieve dimensionality reduction and optimization of multivariate data. Then, a bidirectional long short-term memory (BiLSTM) neural network is used to extract time series features. Finally, a GRU neural network is integrated with the attention mechanism to further learn the change rule of bidirectional time series features and accurately capture the critical moment information. To verify the feasibility of this method for multivariate time series, this study takes stock price prediction as the experimental scenario and compares the method with the BP, LSTM, GRU, BiLSTM-GRU, and BiLSTM-GRU-Attention models. The verification results show that the PCC-BiLSTM-GRU-Attention combined model has higher prediction accuracy than the other models, with a mean absolute percentage error (MAPE) of 2.484% and a coefficient of determination of 0.966.
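
      The PCC screening step and the MAPE metric can be sketched in pure Python as follows. The feature names, the toy values, and the cutoff of 0.5 on the absolute correlation are assumptions for illustration; the abstract does not report the threshold used.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant series carries no linear relationship
    return cov / (sx * sy)


def select_features(features, target, min_abs_r=0.5):
    """Keep features whose |PCC| with the target reaches the cutoff."""
    return [name for name, col in features.items()
            if abs(pearson(col, target)) >= min_abs_r]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


features = {"open": [1, 2, 3, 4], "noise": [1, -1, 1, -1]}
close = [2, 4, 6, 8]
print(select_features(features, close))
print(mape([100, 200], [110, 190]))
```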

    • Vessel Segmentation in Retinal Image Based on Retina-GAN

      2022, 31(7):372-378. DOI: 10.15888/j.cnki.csa.008607

      Abstract (470) HTML (1470) PDF 1.60 M (1994) Comment (0) Favorites

      Abstract:Fundus images play a key role in finding the ophthalmic diseases that can be observed from retinal vessels, and they provide an effective reference for professional medical personnel. However, manual vessel segmentation involves a large workload and is time-consuming and laborious. Therefore, developing an automatic and intelligent segmentation method is of great benefit to relevant personnel. In this study, the attention mechanism and RU-Net structure are integrated into the generator of generative adversarial networks (GANs), forming a new structure named Retina-GAN. At the same time, automatic color equalization (ACE) is selected in the preprocessing of fundus images to improve image contrast and make blood vessels clearer. To validate the proposed approach, we compare Retina-GAN with several other models on the DRIVE dataset. Accuracy, sensitivity, and specificity are measured for comparative analysis. The experiment shows that Retina-GAN has better performance than the other models.

    • Prediction of Tunnel Subsidence Based on Bayesian Optimized XGBoost

      2022, 31(7):379-385. DOI: 10.15888/j.cnki.csa.008588

      Abstract (491) HTML (903) PDF 1.26 M (1019) Comment (0) Favorites

      Abstract:Highway tunnels are susceptible to the influence of the geographical environment and other factors during the construction. The instability of the mountain structure may cause potential safety hazards, and tunnel subsidence is an important indicator of changes in the tunnel structure. Therefore, the model based on Bayesian optimized XGBoost is proposed to predict the tunnel subsidence. The complexity and serious interference of the tunnel construction scene hamper data collection and subsequent change analysis of subsidence. First, the time scale of the original subsidence monitoring data is unified. In accordance with time domain and spatial domain information, the outliers and missing values are repaired. Finally, the integrated Bayesian optimized XGBoost model is used to analyze the peripheral convergence, surface subsidence, and vault subsidence. Compared with the original XGBoost model and long short-term memory (LSTM) model, the Bayesian optimized XGBoost model has the highest accuracy. The average prediction accuracy of vault subsidence, surface subsidence, and peripheral convergence can reach 0.979 4. This model can effectively monitor and predict the change process of tunnel subsidence, which is of importance for practical application during the supervision of tunnel safety.

    • Tree Species Identification by Visible Image from Pre-trained Model and UAV

      2022, 31(7):386-391. DOI: 10.15888/j.cnki.csa.008618

      Abstract (513) HTML (848) PDF 1.72 M (745) Comment (0) Favorites

      Abstract:Enhancing image quality and adopting new image classification methods are two breakthrough points to improve the accuracy of tree species identification by remote sensing images. The research focuses on the identification of Chinese fir and Masson pine by the pre-trained model of VGG16 and unmanned aerial vehicle (UAV) visible images. The DJI Phantom 4 RTK UAV with an FC6310R camera is used to collect color images of artificial pure forests of Chinese fir and Masson pine in Nanping and Sanming cities. Then, two datasets, UAVTree2k and UAVTree20k, are constructed through image preprocessing, annotation, cropping, and enhancement. Furthermore, three fully connected layers and a Sigmoid layer are trained on the UAVTree2k dataset on top of the VGG16 model pre-trained on the ImageNet dataset to investigate the effects of the number of iterations, the batch size, and the partition ratio of the training set and the test set on identification accuracy. The results show that when the number of iterations is 40, the batch size is 16, and the ratio between the training set and the test set is 6:4, the identification effect of the model is best, and the test accuracy reaches 98.63%. Meanwhile, the VGG16-based pre-trained model has a good feature learning ability for a small sample size.

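Contact Information
  • Computer Systems & Applications (《计算机系统应用》)
  • Founded in 1992
  • Sponsor: Institute of Software, Chinese Academy of Sciences
  • Postal code: 100190
  • Phone: 010-62661041
  • Email: csa (a) iscas.ac.cn
  • Website: http://www.c-s-a.org.cn
  • Serial numbers: ISSN 1003-3254
  • CN 11-2854/TP
  • Domestic price: 50 yuan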
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-3
Address:4# South Fourth Street, Zhongguancun,Haidian, Beijing,Postal Code:100190
Phone:010-62661041 Fax: Email:csa (a) iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063