2023, 32(6):1-11. DOI: 10.15888/j.cnki.csa.009116
Abstract:Most aspect-level sentiment analysis methods do not focus on keyword features in the local context. Therefore, this study proposes an aspect-level sentiment analysis model, LCPM (local context pos mask), based on local context keyword feature extraction and enhancement. First, a local context part-of-speech mask mechanism is proposed to extract the features of important words around aspect words and reduce the interference of noise words. Second, the loss function is modified so that the model focuses on the local context keyword features related to aspect words, which improves the model’s sentiment classification performance. Finally, a gating mechanism is designed so that the model can dynamically learn weight coefficients and assign different weights to local context keyword features and global context features. Experiments on four open datasets show that, compared with existing aspect-level sentiment analysis models, the proposed model achieves higher accuracy and MF1 values, which verifies the effectiveness of local context keyword extraction and enhancement for aspect-level sentiment analysis tasks.
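The gating step in this abstract can be illustrated with a small sketch. It is not the LCPM implementation: the scalar sigmoid gate, the function name `gated_fusion`, and the parameters `gate_weights` and `gate_bias` are assumptions, since the abstract does not specify the form of the gate.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(local_feat, global_feat, gate_weights, gate_bias=0.0):
    """Blend two equal-length feature vectors with one learned scalar gate.

    gate_weights (hypothetical) has one weight per element of the
    concatenated [local_feat, global_feat] vector.
    """
    concat = list(local_feat) + list(global_feat)
    gate = sigmoid(sum(w * v for w, v in zip(gate_weights, concat)) + gate_bias)
    # gate weights the local features, (1 - gate) weights the global features
    fused = [gate * l + (1.0 - gate) * g for l, g in zip(local_feat, global_feat)]
    return fused, gate


if __name__ == "__main__":
    fused, gate = gated_fusion([1.0, 3.0], [3.0, 1.0], [0.0, 0.0, 0.0, 0.0])
    print(gate, fused)
```

With all-zero gate weights the gate is 0.5, so both feature sources contribute equally; during training the weights would move the gate toward whichever source is more useful.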
2023, 32(6):12-21. DOI: 10.15888/j.cnki.csa.009138
Abstract:To tackle the problems that existing face image translation models cannot translate among multiple visual attributes and that the translated face images are not clear and natural, this study proposes a multi-attribute face image translation model based on a face recognition method. The model is mainly composed of content and style encoders, an AdaIN decoder, and a face recognition module. First, the two encoders extract the latent encodings of the content and style images; the encodings are then sent into the AdaIN layer for affine transformation, and finally the decoder restores the translated image. A face recognition model is designed and trained using this method with an accuracy rate of 90.282%. A joint face attribute loss function is proposed, which enhances the model’s attention to the attributes of the style face, solves the problem that the model cannot accurately extract the attribute information of the face, and discards irrelevant information so that the model can generate clear, multi-attribute, and diverse face translation images. The method is tested on the open dataset CelebA-HQ, and its results are better than the baselines in terms of both quantitative and qualitative indicators. It also shows good robustness to different face orientations. The images generated by the model can also be used in the field of face image generation to address dataset shortage.
2023, 32(6):22-31. DOI: 10.15888/j.cnki.csa.009118
Abstract:For processing graph data, a variety of graph neural network approaches have been developed; however, most research focuses on the convolutional layer for feature aggregation rather than the pooling layer for downsampling. Additionally, the computation of assignment matrices is required for the pooling approach to creating clusters, and the pooling method for node scores simply employs one scoring strategy. A new multi-dimension graph pooling operator, MDPool, is presented to solve these issues and increase the precision of graph classification tasks. To calculate node scores in various dimensions, the model makes use of information on node features and graph structure. The score weighting across several dimensions is summarized by using an attention technique to provide more reliable node rankings. The set of nodes is chosen to produce induced subgraphs based on the node rankings. The proposed MDPool can be implemented into a variety of graph neural network architectures. The encode-decode model, EDMDPool, is created by stacking the MDPool pooling operator with the convolutional layer of the graph neural network. In the graph classification tasks of four public datasets, EDMDPool performs better than the existing baseline model.
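The score-and-select idea behind this kind of pooling can be sketched as follows. This is an illustrative simplification, not MDPool itself: the softmax attention over score views, the pooling `ratio`, and all function names are assumptions, and the per-view node scores are taken as given rather than computed from features and structure.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def multi_score_topk(score_views, attention_logits, ratio):
    """Combine several per-node score lists and keep the top-ranked nodes.

    score_views: one list of node scores per scoring strategy (hypothetical).
    attention_logits: one unnormalized attention weight per strategy.
    ratio: fraction of nodes to keep.
    """
    weights = softmax(attention_logits)
    n_nodes = len(score_views[0])
    combined = [
        sum(w * view[i] for w, view in zip(weights, score_views))
        for i in range(n_nodes)
    ]
    k = max(1, math.ceil(ratio * n_nodes))
    keep = sorted(range(n_nodes), key=lambda i: combined[i], reverse=True)[:k]
    return sorted(keep), combined


def induced_subgraph(edges, keep):
    """Keep only the edges whose two endpoints both survive pooling."""
    kept = set(keep)
    return [(u, v) for u, v in edges if u in kept and v in kept]


if __name__ == "__main__":
    views = [[0.9, 0.1, 0.5, 0.2], [0.8, 0.2, 0.6, 0.1]]
    keep, _ = multi_score_topk(views, [0.0, 0.0], ratio=0.5)
    print(keep, induced_subgraph([(0, 1), (0, 2), (2, 3)], keep))
```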
2023, 32(6):32-41. DOI: 10.15888/j.cnki.csa.009137
Abstract:Price forecasting is important for the stability of bulk agricultural commodity markets, but bulk agricultural commodity prices have complex correlations with multiple factors. In order to address the current problems of strong dependence on data integrity and the difficulty of single models to fully utilize multiple data features in price forecasting, a boosting ensemble learning method that combines the attention mechanism-based convolutional bi-directional long short-term memory neural network (CNN-BiLSTM-Attention), support vector regression (SVR), and LightGBM is proposed, and experiments are conducted on the datasets containing historical trades, weather, exchange rate, oil price, and other features data, respectively. The experiment takes the price forecasting of wheat and cotton as the target task, uses the mutual information method for feature selection, selects the CNN-BiLSTM-Attention model with low error as the base model, and performs boosting ensemble learning with the machine learning model through linear regression. The experimental results show that the root mean square error (RMSE) of the ensemble learning method is 12.812 and 74.356 for wheat and cotton datasets, which are 11.00%, 0.94%, 4.44%, 1.99%, 13.03%, and 4.39% lower than the three base models, respectively. The method can effectively improve the accuracy of price forecasting.
2023, 32(6):42-50. DOI: 10.15888/j.cnki.csa.009114
Abstract:To address the problem that incremental intrusion detection algorithms do not classify old category data with high accuracy due to catastrophic forgetting of old knowledge, this study proposes an incremental intrusion detection algorithm (ImFace) based on asymmetric multi-feature fusion auto-encoder (AMAE) and fully connected classification deep neural network (C-DNN). In the incremental learning phase, ImFace trains an AMAE model and a C-DNN model for each new batch of the dataset. At the same time, this study solves the problem of C-DNN’s insufficient ability to detect certain categories of data due to unbalanced datasets by oversampling the data through a variational auto-encoder (VAE). In the detection phase, ImFace makes the input data pass through all AMAEs and C-DNNs and then uses the result of AMAEs as the confidence level to select the output result of a C-DNN as the final result. In this study, the CICIDS2017 dataset is used to test the effectiveness of the ImFace algorithm. The experimental results show that the ImFace algorithm not only retains the ability to classify old categories but also has a high detection accuracy for new categories of data.
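The detection phase described above can be sketched under one stated assumption: that the AMAE with the lowest reconstruction error is treated as the most confident, and the C-DNN trained alongside it supplies the final label. The abstract does not say how the confidence is computed, so `detect` and the callables passed to it are hypothetical stand-ins.

```python
def detect(sample, experts):
    """Route a sample to the classifier whose paired auto-encoder fits it best.

    experts: list of (reconstruction_error, classify) callables, one pair per
    incremental batch. Both callables are hypothetical stand-ins for a trained
    AMAE and its C-DNN.
    """
    if not experts:
        raise ValueError("at least one (AMAE, C-DNN) pair is required")
    # Assumption: lower reconstruction error means higher confidence.
    best_index = min(range(len(experts)), key=lambda i: experts[i][0](sample))
    return experts[best_index][1](sample), best_index


if __name__ == "__main__":
    experts = [
        (lambda x: abs(x - 1.0), lambda x: "benign"),
        (lambda x: abs(x - 5.0), lambda x: "ddos"),
    ]
    print(detect(4.6, experts))
```

Because each batch gets its own pair of models, earlier pairs are never retrained, which is how a design of this shape can avoid overwriting old knowledge.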
2023, 32(6):51-59. DOI: 10.15888/j.cnki.csa.009156
Abstract:Ramps are crucial to offshore platforms, and their absence will cause great safety risks to operation sites. To eliminate such risks, this study proposes a detection method of ramp setting up in the berthing row scenario. The method is divided into three parts: firstly, using the object detection algorithm to locate and mark the target; then, extracting the external edge of the marked target area by edge detection; finally, formulating the specific safety measures discrimination algorithm to identify violations in the work site. To solve the detection problems of small targets, this method improves the YOLOv5 and introduces an attention module in feature extraction and feature fusion, which makes the model more lightweight while improving its mean average precision (mAP) from 53.1% to 54.5%. As to rough edge detection, the loss function of the edge detection network PiDiNet is improved. Compared with the original network, the false detection rate decreases from 8.9% to 5.4%. The verification results indicate that the method can be used to detect whether the ramp is set up correctly within the effective time with accuracy up to 91.5%.
2023, 32(6):60-69. DOI: 10.15888/j.cnki.csa.009133
Abstract:Anomaly detection systems play a significant role in cyberspace security and provide effective protection for networks. For complex network traffic, a traditional single classifier often cannot ensure high detection accuracy and strong generalization ability at the same time. In addition, anomaly detection models built on the full feature set are often disturbed by redundant features, which reduces the accuracy and efficiency of detection. To address these problems, this study proposes a feature selection and ensemble learning model based on average feature importance. The decision tree (DT), random forest (RF), and extra tree (ET) are selected as the base classifiers to establish a voting ensemble model, and the average feature importance of the base classifiers is calculated based on the Gini coefficient for feature selection. The experimental evaluation results on several datasets show that the proposed model is superior to the classical ensemble learning models and other well-known anomaly detection ensemble models. The proposed model improves the accuracy of the ensemble model by about 0.13% and saves about 30% of training time on average.
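The two ingredients named here, averaging feature importance across base classifiers and voting over their predictions, can be sketched in a few lines. The importance values, the selection `threshold`, and the function names are illustrative assumptions; the abstract does not state how the cut-off is chosen.

```python
from collections import Counter


def average_importance(importances_per_model):
    """Average each feature's importance over all base classifiers."""
    n_models = len(importances_per_model)
    n_features = len(importances_per_model[0])
    return [
        sum(model[j] for model in importances_per_model) / n_models
        for j in range(n_features)
    ]


def select_features(avg_importance, threshold):
    """Return the indices of features whose average importance meets the cut-off."""
    return [j for j, value in enumerate(avg_importance) if value >= threshold]


def majority_vote(predictions):
    """Hard voting: return the label predicted by the most base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical importances from a DT, an RF, and an ET over three features.
    importances = [[0.5, 0.3, 0.2], [0.6, 0.1, 0.3], [0.4, 0.2, 0.4]]
    avg = average_importance(importances)
    print(select_features(avg, threshold=0.25))
    print(majority_vote(["attack", "normal", "attack"]))
```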
2023, 32(6):70-79. DOI: 10.15888/j.cnki.csa.009145
Abstract:As an extension of software services in the information world to the real world through the Internet of Things (IoT), IoT services have an important role in IoT systems. However, unlike traditional Web services, IoT services have new characteristics such as reality perception, data driving, heterogeneous distribution, and spatio-temporal correlation, which make the existing service models insufficient for an effective portrayal of IoT services and fail to meet the requirements of subsequent service discovery, service offloading, and service combination in IoT applications. On the basis of condensed analysis of IoT service modeling requirements and existing IoT service models, an entity-data-based IoT service modeling framework is proposed, which puts forward the concept of IoT service models fusing ternary information of service, entity, and data and their conceptual relationships. In addition, the framework focuses on defining the spatio-temporal attributes and dependencies of service, entity, and data to support IoT service association representation and analysis based on spatio-temporal correlation, and the entity-data-based IoT service description method is given by extending ontology Web language for services (OWL-S). Finally, the usage mode and effects of the model are discussed in the context of a highway IoT application case.
2023, 32(6):80-90. DOI: 10.15888/j.cnki.csa.009134
Abstract:Automatic underwater object detection methods play an important role in intelligent marine fishing. To address the problem that the existing object detection methods are not accurate enough for underwater biological detection, this study proposes an underwater object detection method based on the GA-RetinaNet algorithm. Firstly, according to the existence of dense objects in underwater images, the study introduces group convolution to replace ordinary convolution, which can provide more feature information without increasing the complexity of parameters and thereby improve the accuracy of the model. Secondly, according to the characteristic that underwater objects are mostly small objects, the attention-guided context feature pyramid network (AC-FPN) is introduced. The context extraction module is used to obtain more receptive fields while guaranteeing high-resolution inputs and thus extract more contextual information. The context attention module and the content attention module are utilized to capture useful features for the accurate positioning of the object. Experimental results show that the improved GA-RetinaNet algorithm enhances the detection accuracy by 2.3% compared with the original algorithm when the URPC2021 dataset is selected. Compared with other mainstream models, the GA-RetinaNet algorithm achieves better detection results for different types of underwater objects, and the detection accuracy is greatly improved.
2023, 32(6):91-98. DOI: 10.15888/j.cnki.csa.008950
Abstract:Considering the high cost and poor timeliness of manual testing and the insufficient scalability of conventional automatic interface testing tools, an interface test platform (OLa) that supports the parallel testing of test case sets is proposed in this study. OLa is divided into the user presentation layer, application logic layer, data service layer, and test case execution layer according to a layered architecture model. Among them, the user presentation layer is developed on the basis of the Vue framework, which is combined with Vue Router, Vuex, and other tools to realize the single-page application. The application logic layer is realized with the Spring Boot framework, and the data service layer is based on the MyBatis-Plus framework and Spring Data framework. The test case execution layer uses okhttp3, fastjson, Jackson, and other tools to implement interface testing. In addition, according to the ideas of the technical architecture of systems, Java network programming, and abstraction-oriented programming, this study innovatively proposes the test case execution process with the C/S architecture and the automatic matching verification method based on parameter identification, which can solve the problem that some traditional automatic testing tools cannot support parallel testing. The experimental results show that OLa can support test case testing, serial testing, and parallel testing of test case sets and can automatically identify test case parameters and verify the interface response content, which improves the flexibility and effectiveness of interface testing. Moreover, it can reduce the difficulty of interface testing and improve the efficiency of interface testing without interdependence between test cases.
2023, 32(6):99-106. DOI: 10.15888/j.cnki.csa.009122
Abstract:The user persona is a sketch and description of the user image, which has been widely used in typical retail scenarios such as the wake-up of sleeping members, prediction of users arriving at the store, and personalized recommendations. Drugs are different from ordinary commodities, and they contain strong semantic knowledge. The existing user persona mainly starts from the consumption attribute and static attribute and is not completely applicable to the pharmacy marketing and prediction field. This study proposes a persona of pharmacy user (UPP) model for the drug field, which embeds medical knowledge on the existing persona and uses methods such as rules, clustering, statistics, and entity recognition to extract new labels including chronic diseases, diseases, special diseases, activity sensitivity, user value, and price preference. All labels are integrated into a clustering-based group division method to form the user profile. The experiment shows that the accuracy of this model is 13% higher than the existing user persona model in the consumer behavior prediction scenario, so the proposed model is more suitable for the pharmacy marketing scenario.
2023, 32(6):107-114. DOI: 10.15888/j.cnki.csa.009115
Abstract:As weather radar technology develops, traditional monolithic radar product generation systems are unable to meet the requirements of rapid detection rate and multi-source data in new weather radar systems. In this study, a distributed radar product generation system based on message scheduling is proposed to improve the real-time performance and scalability of the system. Based on ActiveMQ message middleware, a task scheduling strategy for the distributed cluster is designed. The Ceph file system is employed to build a unified and efficient storage service, and the meteorological algorithm is accelerated in parallel on GPUs. The system has been delivered and applied in actual projects with good results, so it has practical value for wider adoption.
2023, 32(6):115-120. DOI: 10.15888/j.cnki.csa.009140
Abstract:Nowadays, a large amount of medical domain knowledge on the Internet can be used for medical diagnosis, but traditional search engines cannot make reasonable judgments based on the actual situation of patients and fail to meet users’ needs. Therefore, this study develops a question-answering system based on a knowledge graph. The system is applied to the medical field: it uses crawler technology to obtain a large amount of medical data and stores the data in a medical knowledge graph built in the Neo4j graph database. At the same time, to enable the system to better understand the user’s medical questions, this study proposes methods based on the BERT and BERT-BiLSTM-CRF models for identifying intent information and entity information in questions, respectively. Finally, the system uses the intent and entity information to query the knowledge graph and provides users with appropriate answers, thus completing the construction of a medical question-answering system.
2023, 32(6):121-129. DOI: 10.15888/j.cnki.csa.009108
Abstract:The classical artificial bee colony (ABC) algorithm suffers from slow convergence and easily falls into local optima, so feature selection based on this algorithm still has many problems. Therefore, a feature selection method based on the rough entropy of granularity and an improved bee colony algorithm, namely FS_GREIABC, is proposed. Firstly, a new information entropy model, namely the rough entropy of granularity, is proposed by combining knowledge granularity with the rough entropy in rough set theory. Secondly, the rough entropy of granularity is applied to the ABC algorithm, and a fitness function based on it is proposed, which yields a new fitness calculation strategy. Thirdly, to improve the local search ability of the ABC algorithm, a cloud model is introduced into the following bee stage. Experiments on multiple UCI datasets and software defect prediction datasets show that FS_GREIABC not only selects fewer features but also has better classification performance than the existing feature selection algorithms.
2023, 32(6):130-139. DOI: 10.15888/j.cnki.csa.009069
Abstract:Economic globalization has given logos huge commercial value, and the development of computer vision provides a broader application field for logo classification and recognition. Considering that the overall features of logo images are not distinctive and that the number of images is large, this study applies progressive multi-granularity training of jigsaw patches (PMG-Net), a fine-grained image classification method, to a logo image dataset to improve the model’s ability to classify logo images. Input images containing information at different granularities are generated by the jigsaw patch generator, and the progressive multi-granularity training module is then introduced to fuse the features of different granularities. The fused features pay more attention to the subtle differences between images, so logo image classification is significantly improved. The leaky rectified linear unit (LeakyReLU) activation function is used to retain negative feature information when the input image features are extracted, and the channel attention mechanism is introduced to adjust the weights of the feature channels, which strengthens the guidance provided by feature information and improves the classification performance of the model. The experimental results show that the classification accuracy of the proposed method on the logo image dataset is better than that of traditional classification methods. By combining a progressive training strategy over multi-granularity features with a random jigsaw patch generator, this study achieves efficient classification of logo images and provides a new idea for solving the existing problems in logo image classification.
2023, 32(6):140-148. DOI: 10.15888/j.cnki.csa.009121
Abstract:Routing algorithms, image segmentation in computer vision, and many problems in machine learning can be regarded as problems of finding the maximum flow of a network. To make maximum flow algorithms based on hierarchical networks more efficient, a maximum flow algorithm based on a memory-aided search strategy is put forward. The traditional Edmonds-Karp algorithm and Dinic’s algorithm suffer from extra overhead due to repeated searches of invalid paths. Hence, a memory-aided search strategy that records search states is proposed to overcome this problem. Experimental results show that the proposed strategy is efficient and feasible, and the proposed algorithm outperforms Dinic’s algorithm.
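For reference, the baseline that the abstract compares against can be sketched as below. This is the textbook Dinic's algorithm on a level (hierarchical) network, not the proposed memory-aided algorithm, whose details the abstract does not give. The per-node pointer `self.it` is the standard current-arc optimization: it remembers which outgoing edges have already dead-ended within a phase so they are not searched again. The class and method names are illustrative.

```python
from collections import deque


class Dinic:
    """Textbook Dinic's maximum flow on an adjacency-list residual graph."""

    def __init__(self, n):
        self.n = n
        # Each edge is stored as [target, residual_capacity, index_of_reverse_edge].
        self.graph = [[] for _ in range(n)]

    def add_edge(self, u, v, cap):
        """Add a directed edge u -> v (u != v) with the given capacity."""
        self.graph[u].append([v, cap, len(self.graph[v])])
        self.graph[v].append([u, 0, len(self.graph[u]) - 1])

    def _bfs(self, s, t):
        """Build the level network; return True if t is still reachable."""
        self.level = [-1] * self.n
        self.level[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, cap, _ in self.graph[u]:
                if cap > 0 and self.level[v] < 0:
                    self.level[v] = self.level[u] + 1
                    queue.append(v)
        return self.level[t] >= 0

    def _dfs(self, u, t, pushed):
        """Push flow along one augmenting path that respects the levels."""
        if u == t:
            return pushed
        while self.it[u] < len(self.graph[u]):
            edge = self.graph[u][self.it[u]]
            v, cap, rev = edge
            if cap > 0 and self.level[v] == self.level[u] + 1:
                flow = self._dfs(v, t, min(pushed, cap))
                if flow > 0:
                    edge[1] -= flow
                    self.graph[v][rev][1] += flow
                    return flow
            # This edge is exhausted for the current phase; never retry it.
            self.it[u] += 1
        return 0

    def max_flow(self, s, t):
        total = 0
        while self._bfs(s, t):
            self.it = [0] * self.n
            while True:
                flow = self._dfs(s, t, float("inf"))
                if flow == 0:
                    break
                total += flow
        return total


if __name__ == "__main__":
    net = Dinic(4)
    for u, v, cap in [(0, 1, 3), (0, 2, 2), (1, 2, 1), (1, 3, 2), (2, 3, 3)]:
        net.add_edge(u, v, cap)
    print(net.max_flow(0, 3))
```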
2023, 32(6):149-158. DOI: 10.15888/j.cnki.csa.009136
Abstract:On the basis of the discriminative scale space tracking algorithm, the position correction method and Kalman filtering algorithm are applied to pedestrian tracking in this study. Due to deformation and environmental changes, pedestrians cannot be accurately tracked. To solve the problem, this study makes full use of the advantage of the fhog feature in pedestrian tracking and takes the position calculated by the position filter in the discriminative scale space algorithm as the center. It extracts the fhog feature of pedestrians again and correlates it with the position filter template to correct the pedestrian position. Then, the Kalman filtering algorithm is used to predict and correct the corrected pedestrian position again, and finally, a new position filter template is trained at the twice-corrected position. The pedestrian sequences in the OTB-100 dataset are selected to test the performance of the method. The experimental results show that re-extracting the fhog feature at the position given by the original algorithm and correlating it with the template corrects the pedestrian position, and that Kalman filtering of the corrected position further improves the positioning accuracy.
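The predict-and-correct cycle of Kalman filtering mentioned here can be sketched for a single coordinate. This is a minimal scalar filter with a random-walk motion model, an assumption made for brevity; the abstract does not state the state model or noise settings used in the study, and the class name and parameters `q` and `r` are illustrative.

```python
class ScalarKalman:
    """One-dimensional Kalman filter with a random-walk motion model."""

    def __init__(self, x0, p0, q, r):
        self.x = x0  # state estimate (for example, the x coordinate of the target)
        self.p = p0  # variance of the estimate
        self.q = q   # process noise variance (assumed)
        self.r = r   # measurement noise variance (assumed)

    def predict(self):
        """Random walk: the state is unchanged, but uncertainty grows by q."""
        self.p += self.q
        return self.x

    def correct(self, z):
        """Blend the prediction with measurement z using the Kalman gain."""
        gain = self.p / (self.p + self.r)
        self.x += gain * (z - self.x)
        self.p *= 1.0 - gain
        return self.x


def smooth_track(points, q=0.01, r=1.0):
    """Filter a list of (x, y) positions with one scalar filter per axis."""
    fx = ScalarKalman(points[0][0], 1.0, q, r)
    fy = ScalarKalman(points[0][1], 1.0, q, r)
    smoothed = []
    for x, y in points:
        fx.predict()
        fy.predict()
        smoothed.append((fx.correct(x), fy.correct(y)))
    return smoothed


if __name__ == "__main__":
    print(smooth_track([(10.0, 5.0), (11.2, 5.1), (11.9, 4.8)]))
```

In a tracker of the kind described, the measurement `z` would be the position already corrected by the correlation step, so the filter acts as a second correction.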
2023, 32(6):159-165. DOI: 10.15888/j.cnki.csa.009142
Abstract:The search ability of the sparrow search algorithm tends to decline because of insufficient diversity in the initial population, and the algorithm easily falls into local optima in the late search period. To address these problems, a multi-strategy fusion sparrow search algorithm (ISSA) is proposed. Specifically, a high-dimensional Sine chaotic mapping is introduced in the initialization stage to improve the quality and diversity of the initial population. Then, an attenuation factor is introduced in the discoverer stage, and its adaptability balances early global search against later local optimization. Finally, Cauchy mutation and a change selection strategy are introduced so that a searching individual can jump out of a local extremum and continue the search, which enhances the local search ability. Six benchmark test functions are randomly selected, and the experimental results verify that ISSA finds the optimal value more effectively than the original algorithm.
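Two of the ingredients named above can be sketched with the standard library alone. This is a simplified illustration, not ISSA: it uses the common one-dimensional sine map x_{k+1} = (a/4) sin(pi x_k) rather than the high-dimensional variant the abstract mentions, and the mutation rule x' = x + x * Cauchy(0, 1) is one widely used form assumed here. All function names and default values are hypothetical.

```python
import math
import random


def sine_map_sequence(x0, length, a=4.0):
    """Generate a chaotic sequence in [0, 1] with the 1-D sine map."""
    sequence, x = [], x0
    for _ in range(length):
        x = (a / 4.0) * math.sin(math.pi * x)
        sequence.append(x)
    return sequence


def init_population(pop_size, dim, lower, upper, seed_value=0.37):
    """Scale the chaotic sequence into the search bounds [lower, upper]."""
    chaos = sine_map_sequence(seed_value, pop_size * dim)
    return [
        [lower + chaos[i * dim + j] * (upper - lower) for j in range(dim)]
        for i in range(pop_size)
    ]


def cauchy_mutation(position, rng, lower, upper):
    """Perturb each coordinate with a heavy-tailed standard Cauchy step."""
    mutated = []
    for x in position:
        # Inverse-CDF sampling of a standard Cauchy variate.
        step = math.tan(math.pi * (rng.random() - 0.5))
        mutated.append(min(upper, max(lower, x + x * step)))
    return mutated


if __name__ == "__main__":
    population = init_population(pop_size=5, dim=3, lower=-10.0, upper=10.0)
    print(cauchy_mutation(population[0], random.Random(0), -10.0, 10.0))
```

The heavy tails of the Cauchy distribution occasionally produce large jumps, which is why this kind of mutation is used to escape local extrema.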
2023, 32(6):166-172. DOI: 10.15888/j.cnki.csa.009126
Abstract:Domain adaptation is a transfer learning technique used when the training and test sets do not satisfy the independent and identically distributed (i.i.d.) condition. When the distribution difference between two domains is large, the intra-domain transferability will be reduced, and the existing domain adaptation algorithms need a large amount of target domain data, which cannot be obtained in some practical applications. In view of these shortcomings, a domain adaptation algorithm based on feature center alignment for few-shot learning is proposed on top of a convolutional neural network model to find domain-invariant features, improve the distinguishability of target domain features, and improve the classification accuracy. Simulation and experimental results for office-31 public dataset recognition and radar working pattern recognition under small sample conditions show that the proposed method improves the average recognition accuracy on the office-31 dataset by 12.9% compared with the maximum mean discrepancy method, and the radar working pattern recognition accuracy reaches 91%, which is 10% better than the maximum mean discrepancy method.
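One plausible reading of feature center alignment is to pull the per-class mean feature of the target domain toward that of the source domain. The sketch below computes such a loss as the mean squared Euclidean distance between matching class centers. The exact loss used in the study is not given in the abstract, so the formula and function names here are assumptions.

```python
def class_centers(features, labels):
    """Return the mean feature vector of each class."""
    sums, counts = {}, {}
    for feature, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feature)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], feature)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def center_alignment_loss(src_feats, src_labels, tgt_feats, tgt_labels):
    """Mean squared distance between source and target centers of shared classes."""
    src = class_centers(src_feats, src_labels)
    tgt = class_centers(tgt_feats, tgt_labels)
    shared = set(src) & set(tgt)
    if not shared:
        return 0.0
    total = sum(
        sum((a - b) ** 2 for a, b in zip(src[label], tgt[label]))
        for label in shared
    )
    return total / len(shared)


if __name__ == "__main__":
    loss = center_alignment_loss(
        [[0.0, 0.0], [2.0, 0.0]], [0, 0],  # source features and labels
        [[1.0, 2.0]], [0],                 # a single labeled target sample
    )
    print(loss)
```

Because each center is an average, it can be estimated from only a few labeled target samples per class, which fits the few-shot setting.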
2023, 32(6):173-180. DOI: 10.15888/j.cnki.csa.009128
Abstract:For the passive location of radiation sources for motion communications in complex environments, the closed-form solution method is sensitive to measurement noise in time-frequency difference models and has a large root-mean-square location error. To improve the location performance under large observation errors, this study proposes a recursive hybrid TDOA/FDOA location method based on weighted least squares and the genetic algorithm. Firstly, massive time-frequency difference data are observed at known stations, and error models are built. On the basis of these models, multiple sets of time-frequency difference sequences are processed. Secondly, the initial value of the target position is solved by weighted least squares. Given the initial value, the improved genetic algorithm is used to solve and correct the position coordinates iteratively and recursively through multiple groups of time-frequency difference sequences. Finally, the position estimate and the frequency difference model are used to estimate the target velocity. The simulations show that the proposed location algorithm has a lower root-mean-square error than the classical two-step weighted least squares method and can maintain high accuracy under large observation errors. Moreover, compared with other hybrid location algorithms, the proposed algorithm converges quickly and effectively reduces the amount of computation.
2023, 32(6):181-188. DOI: 10.15888/j.cnki.csa.009129
Abstract:There is a serious multipath effect when the radar measures the target at a low elevation angle. The complex position makes the multipath echo produce irregular reflection, which results in different degrees of amplitude and phase distortion. In this study, a perturbational multipath model is introduced to solve the mismatch between the classical multipath model and the multipath echo reflection of the complex positions, and a height measurement method of the synthesized vector maximum likelihood (SVML) based on the perturbational model is studied. Perturbation parameters are introduced to characterize multipath echo phenomena of complex positions and are obtained by the perturbational multipath sparse Bayesian learning (PSBL) algorithm. The obtained parameters are applied to the SVML algorithm, which improves the height measurement performance of VHF radars in complex positions.
2023, 32(6):189-196. DOI: 10.15888/j.cnki.csa.009135
Abstract:The problem of lossy graph summarization is as follows: Given a graph G=(V, E) and a positive integer k, it is required to merge all nodes in graph G into k super nodes so that the resulting summary graph composed of these super nodes can represent the original graph G within a certain error range. As a combinatorial optimization problem based on graph partitioning, this problem is usually solved by randomly extracting node pairs successively and using heuristic methods to select node pairs for merging. This study proposes an effective two-stage algorithm, namely TS_LGS. The algorithm first sets the stage threshold according to the average degree of graph G. Specifically, in the first stage, the current number of super nodes is greater than the stage threshold, and the algorithm selects node pairs among the sampled node pairs in batch for merging based on the current best merging score, so as to effectively reduce the number of iterations; in the second stage, the algorithm preferentially selects adjacent node pairs based on weighted sampling, so as to merge the node pairs with small reconstruction error increment until the number of super nodes is k. The experimental results on several typical real network instances show that TS_LGS can extract graph summarization with lower reconstruction and query errors on the premise of lower time complexity compared with the existing best SAA algorithm.
2023, 32(6):197-203. DOI: 10.15888/j.cnki.csa.009151
Abstract:Monkeypox virus is currently circulating globally, and the skin lesions it causes are clinically indistinguishable from those of other skin diseases, particularly those caused by the smallpox and chickenpox viruses. While deterministic polymerase chain reaction technology and other biological detection technologies are not fully mature, computer-aided diagnosis is a feasible way to detect skin lesions caused by the monkeypox virus, so a classification algorithm for such lesions based on the residual network is proposed. Based on the residual network, the algorithm combines depthwise separable convolution and lightweight attention, which reduces the computational cost and complexity of the model and improves its classification performance. The experimental results show that the algorithm classifies skin lesions caused by the monkeypox virus well: the classification accuracy, recall, and precision are 97.3%, 96.8%, and 97.2%, respectively, which are better than those of the common classification models and other research methods used in the experiment.
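The parameter saving from separable convolution can be checked with simple arithmetic, assuming the technique meant here is the standard depthwise separable convolution (a per-channel k x k depthwise filter followed by a 1 x 1 pointwise convolution). The layer sizes below are illustrative and not taken from the study; bias terms are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


if __name__ == "__main__":
    k, c_in, c_out = 3, 64, 128  # hypothetical layer shape
    full = standard_conv_params(k, c_in, c_out)
    separable = depthwise_separable_params(k, c_in, c_out)
    print(full, separable, round(full / separable, 2))
```

For this hypothetical 3 x 3 layer with 64 input and 128 output channels, the separable form needs 8,768 weights instead of 73,728, roughly an 8.4-fold reduction.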
2023, 32(6):204-211. DOI: 10.15888/j.cnki.csa.009149
Abstract:To address the low accuracy of underwater fish target detection caused by blurred and color-distorted underwater images, complex underwater scenes, and limited target feature extraction ability, this study proposes an improved underwater fish target detection algorithm based on YOLOv5. Firstly, in response to the blurring and color distortion of underwater images, the underwater dark channel prior (UDCP) algorithm is introduced to pre-process the images, which is helpful for correctly identifying the target in different environments. Then, considering the problems of complex underwater scenes and limited target feature extraction ability, the study introduces an efficient correlation channel, i.e., efficient channel attention (ECA), into the YOLOv5 network to enhance the feature extraction ability of the target. Finally, the loss function is improved to enhance the accuracy of the target detection box. Experiments show that the accuracy of the improved YOLOv5 in underwater fish target detection is 2.95% higher than that of the original YOLOv5, and the average detection accuracy (mAP@0.5:0.95) is increased by 5.52%.
2023, 32(6):212-220. DOI: 10.15888/j.cnki.csa.008936
Abstract:Education is an important enabler for achieving sustainable development goals (SDGs). Artificial intelligence (AI) is a booming technology, and people are showing increasing interest in understanding students’ behavior and evaluating their performance. For the SDGs, AI has great potential to improve education, as it has started to be applied in the education field through innovative teaching methods that create better learning. This study presents an AI-based analytic tool for predicting the performance of students in a first-year information technology course at a university. A random forest-based classification model is built to predict students’ performance in Week 6, and the model reports an accuracy of 97.03%, sensitivity of 95.26%, specificity of 98.8%, precision of 98.86%, and a Matthews correlation coefficient of 94%. The result demonstrates that this method is useful for predicting the early performance of students in courses. Experimental results obtained during the COVID-19 pandemic showed that the proposed prediction model met the accuracy, precision, and recall required to predict elements of students’ learning behavior in a virtual education system.
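The five metrics reported above all derive from the four cells of a binary confusion matrix. The sketch below shows the standard formulas; the counts used in the example are hypothetical and are not the study's data.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    # Matthews correlation coefficient: ranges from -1 to 1, 0 means chance level.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.4f}")
```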
2023, 32(6):221-230. DOI: 10.15888/j.cnki.csa.009113
Abstract:In order to address the problem of low accuracy of skin melanoma lesion segmentation in existing image segmentation methods, a MultiResUNet-SMIS method is proposed based on existing convolutional neural network methods. Firstly, according to the imaging characteristics of skin melanoma, dilated convolution with different dilation rates is introduced to replace the standard convolution, and the receptive field is expanded without increasing the number of parameters so that the model can segment the lesion at multiple scales. Secondly, spatial and channel attention mechanisms are added to the model to redistribute feature weights, expand the influence of features of interest, and suppress irrelevant features. Finally, by combining Focal loss with Dice loss, a new loss function, i.e., FD loss, is proposed to calculate the regression loss and solve the problem of unbalanced foreground and background pixels, so as to further improve the segmentation accuracy of the network model. The experimental results show that Dice, IoU, and Acc of MultiResUNet-SMIS on the ISIC-2018 dataset have reached 89.47%, 82.67%, and 96.13%, respectively, which are better than the original MultiResUNet and mainstream methods such as UNet, UNet++, and DeepLab V3+ in skin melanoma image segmentation.
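One plausible way to combine the two losses is a weighted sum, sketched below on flattened per-pixel probabilities. The abstract does not give the exact form of the FD loss, so the mixing weight `weight`, the focal parameters `gamma` and `alpha`, and the smoothing term are all assumptions for illustration.

```python
import math

def focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss averaged over pixels; down-weights easy examples."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)          # avoid log(0)
        pt = p if t == 1 else 1.0 - p            # probability of the true class
        at = alpha if t == 1 else 1.0 - alpha    # class-balancing factor
        total += -at * (1.0 - pt) ** gamma * math.log(pt)
    return total / len(probs)

def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss; measures region overlap, so it is robust to class imbalance."""
    inter = sum(p * t for p, t in zip(probs, targets))
    return 1.0 - (2.0 * inter + smooth) / (sum(probs) + sum(targets) + smooth)

def fd_loss(probs, targets, weight=0.5):
    """Assumed combination: convex mix of focal and Dice terms."""
    return weight * focal_loss(probs, targets) + (1.0 - weight) * dice_loss(probs, targets)

targets = [1, 0, 1, 0]
good = fd_loss([1.0, 0.0, 1.0, 0.0], targets)   # confident and correct
bad = fd_loss([0.1, 0.9, 0.2, 0.8], targets)    # confident and wrong
```

The focal term handles hard pixels individually, while the Dice term scores the overlap of the whole predicted region, which is why the two are often paired when lesions occupy a small fraction of the image.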
2023, 32(6):231-240. DOI: 10.15888/j.cnki.csa.009143
Abstract:In the field of photovoltaic panel defect classification, both traditional defect classification methods and emerging machine learning methods have limitations and fail to meet the requirements for such classification, so more reliable solutions are urgently needed. In recent years, few-shot learning, which can quickly learn from limited data and be generalized to new tasks, has gradually emerged in various fields, bringing new ideas to the optimization of defect classification technology. Based on a typical few-shot learning method, the prototypical network method, this study proposes an improved prototypical network-based defect classification method for photovoltaic panels. By adopting a more complex backbone network, improving the model training mode, and adjusting the similarity measurement standard, this method can effectively solve the problems of the poor feature embedding ability and mediocre classification performance of the prototypical network for complex samples. The method has been verified by several comparative experiments on a classic photovoltaic panel defect dataset. The results show that the experimental time of the improved method is greatly shortened and the model accuracy is improved.
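The classification step of a standard prototypical network can be sketched as follows: each class prototype is the mean of its support embeddings, and a query takes the label of the nearest prototype. The embeddings, the class names, and the pluggable `distance` argument are assumptions for illustration; the improved backbone and the adjusted similarity measure from the study are not reproduced here.

```python
def prototypes(support):
    """Map each label to the mean of its support embeddings."""
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in support.items()
    }

def sq_euclidean(a, b):
    """Squared Euclidean distance, the metric used in the original prototypical network."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(query, protos, distance=sq_euclidean):
    """Return the label whose prototype is closest to the query embedding."""
    return min(protos, key=lambda label: distance(query, protos[label]))

# Hypothetical 2-D embeddings for two defect classes (two support samples each).
support = {
    "crack": [[0.0, 0.0], [0.0, 2.0]],
    "normal": [[4.0, 4.0], [6.0, 4.0]],
}
protos = prototypes(support)
label = classify([1.0, 1.0], protos)
```

Swapping `distance` for another function is the simplest place to experiment with a different similarity measure without touching the rest of the pipeline.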
2023, 32(6):241-250. DOI: 10.15888/j.cnki.csa.009117
Abstract:Anchor-free detection methods have been proposed successively in recent years, and they transform objects into key points and assign labels to positive and negative samples in the global Gaussian heatmap. This label assignment strategy suffers from positive and negative sample imbalance in some scenarios and cannot effectively reflect the shape and orientation of the object in parathyroid detection. Therefore, a new parathyroid detection model, namely, EllipseNet, is proposed in this study, which first constructs an elliptical Gaussian distribution within the ground-truth (GT) box to fit the real object, so as to make the assignment of positive and negative samples more fine-grained. Furthermore, a loss function incorporating the object shape information is proposed to constrain the position of the object, so as to improve the accuracy of detection. In addition, multi-scale prediction is constructed in the model, which can better detect objects of different sizes and solve the problem of target scale imbalance in parathyroid detection. In this study, experiments are conducted on the parathyroid dataset, and the results show that EllipseNet achieves an AP50 of 95%, which is a large improvement in detection accuracy compared with a variety of mainstream detection algorithms.
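To make the label assignment idea concrete, the sketch below renders a rotated elliptical Gaussian heatmap around an object center. It is a generic construction under assumed parameters (`sigma_x`, `sigma_y`, `theta`); how EllipseNet derives these values from the ground-truth box is not stated in the abstract.

```python
import math

def elliptical_gaussian(height, width, cx, cy, sigma_x, sigma_y, theta=0.0):
    """Return a height x width heatmap with peak 1.0 at (cx, cy)."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    heat = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = x - cx, y - cy
            # Rotate the offset into the ellipse's own axes.
            u = dx * cos_t + dy * sin_t
            v = -dx * sin_t + dy * cos_t
            row.append(math.exp(-(u * u / (2 * sigma_x ** 2) + v * v / (2 * sigma_y ** 2))))
        heat.append(row)
    return heat

# A wide, flat object: larger spread along x than along y.
heat = elliptical_gaussian(height=9, width=9, cx=4, cy=4, sigma_x=3.0, sigma_y=1.0)
```

Compared with a circular Gaussian, pixels along the object's long axis keep a higher score than pixels the same distance away along the short axis, which is what allows the heatmap to reflect shape and orientation.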
2023, 32(6):251-259. DOI: 10.15888/j.cnki.csa.009165
Abstract:The existing traditional semantic segmentation methods of crop diseases have low accuracy and poor robustness. In order to address these problems, an improved UNet semantic segmentation model of strawberry diseases based on an attention mechanism is proposed. Firstly, a CNN-Transformer hybrid structure is added to the encoder to improve the feature extraction ability of global information and local detail information. Secondly, the traditional up-sampling is replaced by a dual up-sample module in the decoder to enhance the feature extraction ability and segmentation accuracy. Thirdly, the hard-swish activation function is employed to replace the ReLU activation function, and the smoother curve helps to improve generalization and nonlinear feature extraction ability and prevent vanishing gradients. Finally, the segmentation accuracy is further improved by using a combined cross-entropy Dice loss function to strengthen the model’s constraints on the segmentation results. A dataset consisting of 2,500 images of seven strawberry diseases is used to segment strawberry diseases in a complex background. The semantic segmentation pixel accuracy reaches 92.56%, and the mean intersection over union (mIoU) reaches 84.97%. The experimental results show that the improved UNet in this study can achieve better segmentation results and outperform most segmentation models in the semantic segmentation of strawberry diseases.
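The hard-swish activation named above has a simple closed form, x · ReLU6(x + 3) / 6, shown here next to ReLU for comparison. The helper names are chosen for this example only.

```python
def relu(x):
    """Standard ReLU: zero for negative inputs, identity otherwise."""
    return max(0.0, x)

def hard_swish(x):
    """Hard-swish: x * ReLU6(x + 3) / 6, a piecewise approximation of swish."""
    relu6 = min(max(x + 3.0, 0.0), 6.0)
    return x * relu6 / 6.0

# Compare the two activations on a few sample inputs.
samples = [-4.0, -1.0, 0.0, 1.0, 4.0]
pairs = [(x, relu(x), hard_swish(x)) for x in samples]
```

Unlike ReLU, hard-swish passes a small negative signal for inputs between -3 and 0 and equals the identity for inputs of 3 or more.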
2023, 32(6):260-268. DOI: 10.15888/j.cnki.csa.009080
Abstract:During the classification of Alzheimer’s disease, the hypergraph neural network (HGNN) can extract features from the hypergraph relationship between subjects, which has a good advantage in representing and learning the structure of complex graphs. However, most models directly or indirectly decompose the higher-order complex relationship between subjects represented by hypergraphs into the simple binary relationship for feature learning, without effectively using the higher-order information of hyperedges. Therefore, an Alzheimer’s disease classification model based on the line-hypergraph neural network (L-HGNN) is proposed. The model uses sparse linear regression to represent the multiple correlations between subjects. With the help of the transformation of hypergraphs and line graphs, the higher-order neighborhood information transmission of nodes and the learning of overall structural features of hyperedges are realized in convolutional network models. Meanwhile, a more differentiated node embedding is generated by the attention mechanism, which is then used in the auxiliary diagnosis of Alzheimer’s disease. Compared with the results of two commonly used methods on the ADNI dataset, the experimental results show that the proposed method can effectively improve the classification accuracy and has important application value in the early diagnosis of Alzheimer’s disease.
2023, 32(6):269-277. DOI: 10.15888/j.cnki.csa.009124
Abstract:Formal methods are required for the automatic generation of codes to ensure that the code generated by the compiler can be applied to nuclear power instrument and control systems and thus minimize the errors introduced by the compiler during the compilation of synchronous data-flow languages. This study uses the theorem proving tool Coq to formally define the syntax, semantics, and translation algorithms involved in the translation phase of the master-node input structure of the synchronous data-flow language from Lustre to Clight and completes the formal proof of the translation algorithm. It is shown that this formalized compiler can generate credible target code that is consistent with the behavior of the source code, and meanwhile, the generated target code can well satisfy the implementation specifications of nuclear power instrument and control systems.
2023, 32(6):278-285. DOI: 10.15888/j.cnki.csa.009162
Abstract:For limited equipment resources in industrial scenarios, a lightweight strip steel defect detection model based on improved YOLOv5 is proposed. First, ShuffleNetv2 is used to replace the backbone feature extraction network to optimize the number of model parameters and running speed. Second, the lightweight up-sampling operator, namely content-aware reassembly of features (CARAFE), is used to further reduce parameters and calculation amount while increasing the receptive field. At the same time, the GSConv layer is introduced to balance the model accuracy and detection speed while ensuring semantic information. Finally, a cross-level feature fusion mechanism is designed to improve the detection accuracy of the network. The experimental results show that the mean average precision of the improved model is 78.5%, which is 1.4% higher than the original YOLOv5 algorithm. The calculation amount of the model is 10.9 GFLOPs; the parameter amount is 5.88×10^6; the calculation and parameter amounts are reduced by 31% and 15.4%, respectively; the detection speed is 49 f/s, which is increased by 3.5 f/s. Therefore, the improved model improves the detection accuracy and speed and greatly reduces the calculation and parameter amounts of the model, which can ensure the real-time detection of surface defects of strip steel.
2023, 32(6):286-292. DOI: 10.15888/j.cnki.csa.009148
Abstract:Multi-label image classification is a research hotspot in multi-label data classification. The existing multi-label image classification methods only learn the visual representation features of images and ignore the relevant information between image labels and the correspondence between label semantics and image features. In order to solve these problems, a multi-label image classification model based on a multi-head graph attention network and graph model (ML-M-GAT) is proposed. By using label co-occurrence and attribute information, the model builds a graph model, and it employs the multi-head attention mechanism to learn the attention weight of the label. In addition, the model utilizes label weights to fuse label semantic features and image features, so as to integrate label correlation and label semantic information into the multi-label image classification model. In order to verify the effectiveness of the proposed model, experiments are carried out on the public datasets VOC-2007 and COCO-2014, and the experimental results show that the mean average precision (mAP) of the ML-M-GAT model on the two datasets is 94% and 82.2%, respectively, which is better than the results of the CNN-RNN, ResNet101, MLIR, and MIC-FLC models and 4.2% and 3.9% higher than those of the ResNet101 model, respectively. Therefore, the proposed model can improve the performance of multi-label image classification by using image label information.
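A label graph built from co-occurrence statistics can be sketched as below. This follows a common construction in which a directed edge a → b is kept when the conditional probability P(b | a) estimated from the training labels passes a threshold; whether ML-M-GAT uses exactly this rule is not stated in the abstract, and the threshold value and label names are assumptions.

```python
def cooccurrence_graph(label_sets, threshold=0.5):
    """Return {(a, b): P(b | a)} for label pairs whose conditional probability >= threshold."""
    count, pair = {}, {}
    for labels in label_sets:
        for a in labels:
            count[a] = count.get(a, 0) + 1          # images containing label a
            for b in labels:
                if a != b:
                    pair[(a, b)] = pair.get((a, b), 0) + 1  # images containing both
    edges = {}
    for (a, b), n in pair.items():
        p = n / count[a]
        if p >= threshold:
            edges[(a, b)] = p
    return edges

# Hypothetical per-image label sets.
label_sets = [{"person", "dog"}, {"person"}, {"dog", "person"}, {"cat"}]
edges = cooccurrence_graph(label_sets)
```

The resulting graph is asymmetric: here every image with a dog also shows a person, but not the reverse, so the two directions carry different weights. An attention layer can then learn how much each neighboring label should contribute.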
2023, 32(6):293-300. DOI: 10.15888/j.cnki.csa.009150
Abstract:High-quality question-answering plays an important role in human activities and artificial intelligence because it can help to obtain knowledge from articles, improve the performance of question-answering systems, and promote machine reading comprehension. The current mainstream question-answer pair generation methods usually rely on candidate answers in the provided article to generate specific questions based on these answers. However, some candidate answers may generate questions that cannot be answered from the article, or the answers to the generated questions are no longer the same as the candidate answers, which results in a poor correlation within the question-answer pairs and lowers their quality. In order to solve these problems, this study proposes a method to generate question-answer pairs based on key phrase extraction and filtering. The method can automatically extract key phrases suitable for generating questions from the input text as the candidate answers and then generate question-answer pairs by a question generator and an answer generator according to the candidate answers. Finally, the method outputs question-answer pairs with high quality by comparing the similarity between the candidate answers and the generated answers and filtering out those question-answer pairs that have a low correlation with the candidate answers. The proposed method is evaluated by experiments on the SQuAD 1.1 and NewsQA datasets, and the quality of generated question-answer pairs is manually checked. The results show that this method can effectively improve the quality of generated question-answer pairs.
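The final filtering step can be illustrated with a small sketch. Token-level F1 (in the style of the SQuAD evaluation metric) stands in for the similarity measure, and the threshold of 0.5 is arbitrary; the abstract does not specify which similarity function or cutoff the study uses, so both are assumptions.

```python
from collections import Counter

def token_f1(candidate, generated):
    """Token-overlap F1 between a candidate answer and a generated answer."""
    cand, gen = candidate.lower().split(), generated.lower().split()
    common = sum((Counter(cand) & Counter(gen)).values())
    if common == 0:
        return 0.0
    precision = common / len(gen)
    recall = common / len(cand)
    return 2 * precision * recall / (precision + recall)

def filter_pairs(triples, threshold=0.5):
    """Keep (question, generated answer) pairs that agree with the candidate answer."""
    return [(q, gen) for cand, q, gen in triples if token_f1(cand, gen) >= threshold]

# Hypothetical (candidate answer, generated question, generated answer) triples.
triples = [
    ("the Eiffel Tower", "Which landmark opened in 1889?", "Eiffel Tower"),
    ("in 1889", "Who designed the tower?", "Gustave Eiffel"),
]
kept = filter_pairs(triples)
```

The second triple is dropped because the answer generator produced something unrelated to the candidate answer, which is the failure case the filtering stage is meant to catch.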
2023, 32(6):301-307. DOI: 10.15888/j.cnki.csa.009091
Abstract:Heavy pollution weather is the key target of air pollution control during the 14th Five-Year Plan period. Accurate identification of risk sources during the period of heavy pollution weather paves the way for early warning in time, effective environmental pollution control, and prevention of the further aggravation of pollution incidents. On the basis of the data obtained by grid monitoring technology, this study proposes a deep learning model combining the residual network (ResNet), graph convolutional network (GCN), and gated recurrent unit (GRU) network, i.e., the ResGCN-GRU. This model is mainly used to identify risk sources during the period of heavy pollution weather. The risk sources of such weather are often regional and have salient spatiotemporal features. Therefore, this study starts by extracting the spatial features among monitoring points with the GCN and solving the problems of over-smoothing and vanishing gradients caused by the multi-layer GCN with the ResNet. Then, the GRU is used to extract the temporal features of risk sources. Finally, the spatiotemporal features fused by the fully connected layer are input into the Softmax activation function to obtain the binary classification probability, which is further used to obtain the classification result. To verify the performance of the proposed model, this study analyzes the data of 72 monitoring points in Shenyang and compares the model with GCN, long short-term memory (LSTM), GRU, and GCN-GRU in accuracy, recall rate, and comprehensive evaluation indicators. The experimental results show that the classification accuracy of the ResGCN-GRU model is 16.9%, 4.3%, 3.1%, and 2.9% higher than that of the above four models, respectively, which proves that the model proposed in this study is more effective in identifying air risk sources, and it can accurately identify risk sources according to the spatiotemporal features of risk source data.
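The spatial part of the model can be sketched as a single graph convolution with a residual (skip) connection. The layer below uses the widely used symmetric normalization of the adjacency matrix with self-loops and adds the input back to the output; the exact layer design of ResGCN-GRU is not given in the abstract, so the function names, the square weight matrix, and the toy graph are assumptions.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalized_adjacency(adj):
    """Return D^(-1/2) (A + I) D^(-1/2), where D is the degree matrix of A + I."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def res_gcn_layer(adj, h, w):
    """One GCN layer with a skip connection: ReLU(A_norm H W) + H (w must be square)."""
    z = matmul(matmul(normalized_adjacency(adj), h), w)
    return [[max(0.0, v) + skip for v, skip in zip(z_row, h_row)]
            for z_row, h_row in zip(z, h)]

# Two connected monitoring points with 2-D features and an identity weight matrix.
adj = [[0.0, 1.0], [1.0, 0.0]]
h = [[1.0, 0.0], [0.0, 1.0]]
w = [[1.0, 0.0], [0.0, 1.0]]
out = res_gcn_layer(adj, h, w)
```

Without the skip term, both nodes in this toy graph would end up with identical features after one layer, a small-scale picture of over-smoothing. Adding the input back keeps each node's own signal distinguishable.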