• Current Issue
  • Online First
  • Archive
  • Click Rank
  • Most Downloaded
  • Review Articles
    2024,33(7):1-13, DOI: 10.15888/j.cnki.csa.009580
    [Abstract] (165) [HTML] (22) [PDF 4.34 M] (285)
    Abstract:
    Long-term object tracking remains a formidable challenge compared with short-term tracking. Existing long-term tracking algorithms often perform poorly when targets frequently disappear and reappear or undergo drastic appearance changes. This study proposes a novel, robust, real-time long-term tracking framework built from a local search module and a global search tracking module. The local search module uses the TransT short-term tracker to generate a series of candidate boxes, and the best candidate is selected by confidence scoring. For global re-detection, a novel global search tracking module is developed on the Faster R-CNN model, introducing Non-Local operations and multi-level instance feature fusion modules in the RPN and R-CNN stages to fully exploit instance-level target features. To improve the performance of the global search tracking module, a dual-template update strategy enhances the robustness of the tracker: by using templates updated at different time points, the tracker adapts better to target changes. Target presence is determined from the local or global confidence score, and the corresponding search strategy is selected for the next frame; the local search module can also estimate the position and size of the target. Moreover, a ranking loss function is introduced for the global search tracker, implicitly learning the similarity between region proposals and the original query target. Extensive experiments on multiple tracking datasets consistently demonstrate that the proposed framework achieves satisfactory performance.
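    As a rough illustration of the confidence-gated switching this abstract describes, the sketch below alternates between local tracking and global re-detection; the tracker objects, their interfaces, and the threshold tau are hypothetical stand-ins, not the authors' implementation.

    def track_sequence(frames, local_tracker, global_tracker, tau=0.5):
        # state holds the last confidently tracked box; None triggers re-detection
        state, results = None, []
        for frame in frames:
            if state is not None:
                box, conf = local_tracker.track(frame, state)   # best local candidate + score
            else:
                box, conf = global_tracker.redetect(frame)      # global re-detection
            state = box if conf >= tau else None                # presence decided by confidence
            results.append((box, conf))
        return results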
    2024,33(7):14-25, DOI: 10.15888/j.cnki.csa.009585
    [Abstract] (124) [HTML] (17) [PDF 2.68 M] (330)
    Abstract:
    The uncertainty of a neural network reflects the predictive confidence of a deep learning model, enabling timely human intervention in unreliable decisions, which is crucial for system safety. However, existing measurement methods often require significant modifications to the model or the training process, making them complex to implement. To address this, this study proposes an uncertainty measurement approach based on statistical modeling and analysis of neuron activation values within a single forward pass. An improved kernel density estimation technique is employed to construct neuron activation distributions that characterize the normal operating range of each neuron. A neighborhood-weighted density estimation method is then used to compute anomaly factors, effectively quantifying how far a test sample deviates from the neuron activation distributions. Finally, by statistically aggregating the anomaly factors of individual neurons, the cumulative anomaly factor of a sample provides a new perspective for assessing model uncertainty. Experimental results across multiple public datasets and models, visualized via feature maps, demonstrate that the proposed method clearly distinguishes in-domain from out-of-domain samples. Moreover, the method performs exceptionally well in out-of-domain detection, with AUROC exceeding other methods across various experimental setups, validating its generality and effectiveness.
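    A minimal sketch of the per-neuron activation modeling outlined above, using plain Gaussian KDE from scikit-learn in place of the paper's improved estimator and neighborhood-weighted density; the bandwidth and the simple summation over neurons are assumptions.

    import numpy as np
    from sklearn.neighbors import KernelDensity

    def fit_activation_kdes(train_acts):
        # train_acts: (n_samples, n_neurons) activations from one layer
        return [KernelDensity(bandwidth=0.2).fit(train_acts[:, j:j + 1])
                for j in range(train_acts.shape[1])]

    def anomaly_factor(kdes, act):
        # act: (n_neurons,) activations of one test sample; low log-density under
        # a neuron's fitted distribution means a larger deviation from its normal range
        logp = np.array([kde.score_samples([[a]])[0] for kde, a in zip(kdes, act)])
        return -logp.sum()  # cumulative anomaly factor over all neurons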
    2024,33(7):26-38, DOI: 10.15888/j.cnki.csa.009547
    [Abstract] (161) [HTML] (30) [PDF 2.97 M] (344)
    Abstract:
    Hierarchical federated learning (HFL) aims to optimize model performance and preserve data privacy through multi-layered collaborative learning. However, its effectiveness relies on effective incentives for the participating parties and on strategies to handle information asymmetry. To address these issues, this study proposes a layered incentive mechanism that protects the privacy of end devices, edge servers, and cloud servers. At the edge-device layer, edge servers act as intermediaries and use multi-dimensional contract theory to design a variety of contract items, encouraging end devices to participate in HFL with local data without disclosing their costs of data collection, model training, and model transmission. At the cloud-edge layer, a Stackelberg game models the relationship between the cloud server's unit data reward and the edge servers' data sizes and is then transformed into a Markov process, all while keeping the edge servers' unit profits confidential. Multi-agent deep reinforcement learning (MADRL) is then used to approach the Stackelberg equilibrium (SE) incrementally while preserving privacy. Experimental results indicate that the proposed incentive mechanism outperforms traditional approaches, yielding an almost 11% increase in cloud server revenue and an approximately 18-fold improvement in cost-effectiveness.
    2024,33(7):39-51, DOI: 10.15888/j.cnki.csa.009555
    [Abstract] (169) [HTML] (28) [PDF 4.39 M] (297)
    Abstract:
    Current image denoising algorithms based on deep learning fail to comprehensively consider local and global feature information, which degrades denoising quality in image details. To address this problem, this study proposes a hybrid CNN and Transformer image denoising network (HCT-Net). First, a CNN and Transformer coupling block (CTB) is proposed: a two-branch structure that integrates convolution and channel self-attention to alleviate the high computational overhead of relying solely on the Transformer, while attention weights are dynamically allocated so that the network focuses on important feature information. Second, a self-attention enhanced convolution module (SAConv) is designed, adopting a progressive combination of modules and nonlinear transformations to attenuate noise interference and identify local features under complex noise levels. Experimental results on six benchmark datasets show that HCT-Net has better feature perception than several current advanced denoising methods and can suppress high-frequency noise to recover the edge and detail information of images.
    2024,33(7):52-62, DOI: 10.15888/j.cnki.csa.009523
    [Abstract] (165) [HTML] (25) [PDF 3.42 M] (274)
    Abstract:
    Small target detection, a very challenging problem in object detection, is widely encountered in daily life. In video surveillance, the faces of pedestrians about 20 meters from the camera can be considered small targets. Since faces may occlude one another and are susceptible to noise, weather, and lighting conditions, existing detection models perform worse on such small targets than on medium and large ones. To address these issues, this study proposes an improved YOLOv7 model that adds a high-resolution detection head and rebuilds the backbone network based on GhostNetV2. The PANet structure is replaced with a combination of BiFPN and SA attention modules to enhance multi-scale feature fusion, and the original CIoU loss function is improved with the Wasserstein distance, reducing the sensitivity of small targets to anchor box position offsets. Comparative experiments on the public VisDrone2019 dataset and a self-made video surveillance dataset show that the mAP of the proposed method improves to 50.1% on VisDrone2019 and is 1.6 percentage points higher than existing methods on the surveillance dataset, effectively improving small target detection while achieving good real-time performance on a GTX 1080Ti.
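    The abstract couples CIoU with a Wasserstein distance; a common instantiation is the normalized Gaussian Wasserstein distance between boxes modeled as 2D Gaussians, sketched below. The constant C and the exact way the paper blends this term with CIoU are assumptions here.

    import math

    def nwd(box1, box2, C=12.8):
        # boxes as (cx, cy, w, h), each modeled as a 2D Gaussian N(center, diag(w/2, h/2)^2)
        cx1, cy1, w1, h1 = box1
        cx2, cy2, w2, h2 = box2
        w2_sq = ((cx1 - cx2) ** 2 + (cy1 - cy2) ** 2
                 + ((w1 - w2) / 2) ** 2 + ((h1 - h2) / 2) ** 2)
        return math.exp(-math.sqrt(w2_sq) / C)  # in (0, 1]; smoother than IoU for tiny boxes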
    2024,33(7):63-73, DOI: 10.15888/j.cnki.csa.009588
    [Abstract] (183) [HTML] (24) [PDF 1.86 M] (335)
    Abstract:
    In semantic segmentation tasks, downsampling in the encoder reduces resolution and loses spatial detail, so segmentation at object edges may be discontinuous or incorrect, harming overall segmentation performance. To address these issues, an image semantic segmentation model based on edge features and attention mechanisms, EASSNet, is proposed. First, an edge detection operator computes the edge map of the original image, and edge features are extracted through pooled downsampling and convolution operations. Next, the edge features are fused into the deep semantic features extracted by the encoder, restoring the spatial detail of the downsampled feature maps, and meaningful information is strengthened through attention mechanisms to improve the accuracy of object edge segmentation and overall semantic segmentation performance. Finally, EASSNet achieves mean intersection over union (mIoU) of 85.9% and 76.7% on the PASCAL VOC 2012 and Cityscapes datasets, respectively. Compared with current popular semantic segmentation networks, EASSNet shows significant advantages in both overall segmentation performance and object edge segmentation.
    Available online:  July 03, 2024 , DOI: 10.15888/j.cnki.csa.009591
    Abstract:
    Ancient Chinese texts are rich in historical and cultural information, and extracting entity relations from them to construct knowledge graphs plays an important role in cultural inheritance. Given the many rare characters, semantic fuzziness, and ambiguity in ancient Chinese texts, a joint entity-relation extraction model based on the BERT-ancient-Chinese pre-trained model (JEBAC) is proposed. First, the BERT-ancient-Chinese pre-trained model, integrated with a BiLSTM neural network and an attention mechanism (BACBA), identifies all subject and object entities in a sentence, providing a basis for the joint extraction of relations and object entities. Next, the normalized coding vector of the subject entity is added to the embedding vector of the whole sentence to better capture the semantic features of the subject entity in the sentence. Finally, the sentence vector carrying subject-entity features, combined with prompt information about the object entity, is fed to BACBA to jointly extract the relation and object entity, yielding all triples (subject entity, relation, object entity) in the sentence. The proposed method is compared with existing methods on the Chinese entity-relation extraction dataset DuIE2.0 and the classical-Chinese small-sample entity-relation extraction dataset C-CLUE of CCKS 2021. Experimental results show that the proposed method achieves better extraction performance, with F1 values of 79.2% and 55.5%, respectively.
    Available online:  July 03, 2024 , DOI: 10.15888/j.cnki.csa.009599
    Abstract:
    With the development of GPS positioning technology and the mobile Internet, location-based service (LBS) applications have accumulated large amounts of spatio-textual data carrying location and text markup, widely used in location selection decisions such as marketing and urban planning. The goal of spatio-textual location selection is to mine, from a given candidate set, the optimal locations for new facilities so as to influence the largest number of spatio-textual objects (such as people or vehicles), where the closer the spatial location and the more similar the text, the greater the influence. However, existing solutions neither consider the peer competition prevalent in real life nor account for user evaluations of facilities. To make more reasonable location selection decisions in a competitive environment combined with user ratings, this study formulates a more realistic spatio-textual location selection problem, CoSTUR. To overcome the limitation of traditional models in which an object can be influenced by only a single facility, a threshold trading off the certainty and quantity of facility influence is introduced, modeling the real-world situation where multiple facilities can simultaneously influence a specific user. Based on the classical competitive equalization model, competition among facilities with different ratings is quantified. To reduce the high computational cost on large volumes of data, a novel spatio-textual index structure, the TaR-tree, is constructed, and two influence-range-based pruning strategies combined with the threshold yield two branch-and-bound solutions for spatial connectivity and range queries. Experimental results on real and synthetic datasets demonstrate that computational efficiency improves by nearly an order of magnitude over baseline algorithms, verifying the effectiveness of the proposed method.
    Available online:  July 03, 2024 , DOI: 10.15888/j.cnki.csa.009601
    Abstract:
    Sparse mobile crowdsensing (MCS) is an emerging paradigm that collects data from a subset of sensing areas and then infers the data of the remaining areas. However, workers are often scarce or unevenly distributed when sparse MCS is applied, so under a limited budget it is important to prioritize the participation of the more important workers in data collection. In addition, many sparse MCS applications require timely data, so this study considers data freshness, with age of information (AoI) serving as the freshness metric. To address these challenges, a simplified AoI-aware sensing and inference (SASI) framework is proposed, which optimizes AoI and inference accuracy by selecting suitable workers for data collection under budget constraints and by accurately capturing the spatiotemporal relationships in the sensed data. Moreover, since limited budgets and worker availability may reduce the volume of collected data, methods for streamlining the data inference model are also proposed to enhance inference efficiency. Experiments substantiate the superiority of this framework in practice.
    Available online:  July 03, 2024 , DOI: 10.15888/j.cnki.csa.009590
    Abstract:
    The knowledge tracing task aims to accurately track students' knowledge states in real time and predict their future performance by analyzing their historical learning data. This study proposes a deep memory network knowledge tracing model incorporating relationships among knowledge points (HRGKT), addressing current research's neglect of the complex higher-order relationships among the knowledge points covered by questions. First, HRGKT uses a knowledge point relationship graph to define the relational information between nodes in the graph, which represents the rich information among knowledge points, and a graph attention network (GAT) is used to capture the higher-order relationships among them. Then, since forgetting occurs during learning, HRGKT considers four factors affecting knowledge forgetting to track students' knowledge states more accurately. Finally, experimental comparisons on real online education datasets show that HRGKT traces students' knowledge mastery more accurately and predicts performance better than current knowledge tracing models.
    Available online:  July 03, 2024 , DOI: 10.15888/j.cnki.csa.009606
    Abstract:
    Implicit feedback data plays a crucial role in recommender systems, but it is often sparse and biased, including exposure bias and conformity bias. Existing debiasing methods tend to address only one type of bias, which limits personalized recommendation, or require an expensive unbiased dataset as auxiliary information for multi-bias removal. To address this issue, a collaborative filtering recommendation algorithm designed for sparse implicit feedback, which simultaneously removes exposure bias and conformity bias, is proposed. The algorithm uses the proposed dual inverse propensity weighting method and a contrastive learning auxiliary task to remove the two biases from the implicit feedback data, which are then fed into dual-tower autoencoders so that the complete algorithm can estimate users' preference probabilities for items. Experimental results demonstrate that the proposed algorithm outperforms comparison algorithms in normalized discounted cumulative gain (NDCG@K), mean average precision (MAP@K), and recall (Recall@K) on publicly available unbiased datasets such as Coat and Yahoo!R3.
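    A minimal sketch of inverse propensity weighting on implicit feedback, with the paper's dual weighting over exposure and conformity simplified to a single precomputed propensity per interaction:

    import numpy as np

    def ipw_loss(preds, clicks, propensities, eps=1e-6):
        # preds, clicks, propensities: 1-D arrays over observed user-item pairs.
        # Reweighting each term by 1/propensity makes the empirical loss an
        # unbiased estimate of the loss under the full exposure distribution.
        w = 1.0 / np.clip(propensities, eps, 1.0)
        bce = -(clicks * np.log(preds + eps) + (1 - clicks) * np.log(1 - preds + eps))
        return np.mean(w * bce)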
    Available online:  July 03, 2024 , DOI: 10.15888/j.cnki.csa.009613
    Abstract:
    In computer vision segmentation, Transformer-based image segmentation models need large amounts of image data to achieve their best performance, yet medical image data are very scarce compared with natural images. Convolution, with its stronger inductive bias, is more suitable for medical images. To combine the long-range representation learning of Transformers with the inductive bias of CNNs, a residual ConvNeXt module is designed in this research to mimic the design structure of a Transformer block. The module, composed of depthwise and pointwise convolutions, extracts feature information with far fewer parameters, and the receptive field and feature channels are effectively scaled and expanded to enrich the feature information. In addition, an asymmetric 3D U-shaped network called ASUNet is proposed for brain tumor image segmentation. In the asymmetric U-shaped structure, the output features of the last two encoders are connected by residual connections to expand the number of channels. Finally, deep supervision is used during upsampling to promote the recovery of semantic information. Experimental results on the BraTS 2020 and FeTS 2021 datasets show Dice scores for ET, WT, and TC of 77.08%, 90.83%, and 83.41%, and of 75.63%, 90.45%, and 84.21%, respectively. Comparative experiments show that ASUNet competes fully with Transformer-based models in accuracy while retaining the simplicity and efficiency of standard convolutional neural networks.
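    A minimal PyTorch sketch of the residual depthwise + pointwise convolution block the abstract describes, written in 3D for volumetric data; the kernel size, expansion ratio, and normalization are assumptions, not the paper's exact configuration.

    import torch.nn as nn

    class ResidualConvNeXtBlock3d(nn.Module):
        def __init__(self, channels, expansion=4):
            super().__init__()
            self.dw = nn.Conv3d(channels, channels, kernel_size=7,
                                padding=3, groups=channels)  # depthwise: spatial mixing
            self.norm = nn.InstanceNorm3d(channels)
            self.pw1 = nn.Conv3d(channels, channels * expansion, kernel_size=1)
            self.act = nn.GELU()
            self.pw2 = nn.Conv3d(channels * expansion, channels, kernel_size=1)

        def forward(self, x):
            y = self.pw2(self.act(self.pw1(self.norm(self.dw(x)))))
            return x + y  # residual connection, mirroring a Transformer block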
    Available online:  July 03, 2024 , DOI: 10.15888/j.cnki.csa.009629
    Abstract:
    To solve the flow shop scheduling problem with limited buffers and machine processing gears (FSSP_LBMPG), this research establishes a mathematical programming model for green flow shops with limited buffers. The model has two objectives: minimizing the maximum completion time and minimizing processing energy consumption. With buffer capacity as a constraint, processing speed and energy consumption are balanced by reasonably selecting machine processing gears. Based on the characteristics of the problem model, an improved dandelion optimization algorithm (IDOA) is proposed. The algorithm first designs a double-layer real-valued encoding mechanism to represent solutions in accordance with the characteristics of the scheduling problem, and an initialization mechanism improves the quality and efficiency of the initial solutions. During iteration, a real-valued crossover strategy and a variable neighborhood search strategy are designed to compensate for the poor local search ability of the original dandelion algorithm and to enhance the exploitation capability of the improved algorithm. Comparative experiments on designed cases show that the proposed improvements effectively enhance the performance of the original algorithm, verifying the effectiveness and robustness of IDOA.
    Available online:  July 03, 2024 , DOI: 10.15888/j.cnki.csa.009577
    Abstract:
    In recent years, image segmentation based on convolutional neural networks (CNNs) has been widely applied, with great progress in feature extraction. However, as convolutional layers deepen, the receptive field keeps enlarging, causing the model to lose local feature information and degrading performance. Graph convolutional networks (GCNs) process information on graph-structured data and preserve local features as layers deepen. This study combines CNN-based U-Net feature extraction (U-Net is a symmetric fully convolutional network) with GCN-based image segmentation to extract global and local, shallow and deep multi-scale feature sets for multimodal glioma MR sequence image segmentation. The process has two stages. First, U-Net extracts features from multimodal brain glioma MR sequence images, with multiple pooling layers realizing multi-scale feature extraction and upsampling for feature fusion: the bottom layers output lower-level features and the top layers output more abstract high-level features. Second, the feature maps obtained by U-Net are converted into the graph-structured data required by the GCN through neighborhood dilation and sparsification, turning the image segmentation problem into a graph node classification problem. Finally, the graph nodes are classified by cosine similarity. Experiments achieve a segmentation accuracy of 0.996 and a sensitivity of 0.892 on the public BraTS 2018 database. Compared with other deep learning models, this method fuses multi-scale features and uses the GCN to establish topological connections between high- and low-level features, so local information is not lost and better segmentation results are achieved, meeting the needs of clinical glioma MR image analysis and effectively improving the diagnostic accuracy of gliomas.
    Available online:  July 03, 2024 , DOI: 10.15888/j.cnki.csa.009586
    Abstract:
    While large language models based on natural language generation (NLG), represented by ChatGPT, perform well in many natural language processing tasks, their performance in sequence labeling tasks such as named entity recognition (NER) lags behind deep learning models based on bidirectional encoder representations from Transformers (BERT). To address this issue, this study first reformulates the existing Chinese NER problem as a machine reading comprehension problem and proposes a new NER method based on in-context learning and fine-tuning, enabling NLG-based language models to achieve good NER results without changing the pre-trained parameters of the base model. Additionally, since named entities are generated by the model rather than classified from the original text, there are no boundary issues. Experiments on several Chinese NER datasets verify the effectiveness of the new framework: on the Resume and Weibo datasets, F1 scores reach 96.04% and 67.87% respectively, gains of 0.4 and 2.7 percentage points over state-of-the-art models, confirming that the framework can effectively exploit the text generation strengths of NLG-based language models for NER tasks.
    Available online:  July 03, 2024 , DOI: 10.15888/j.cnki.csa.009535
    Abstract:
    Considering that adjacent container terminals in the same region have similar logistics functions, overlapping cargo hinterlands, severe disorderly competition, and low resource utilization, this study focuses on the problem of multiple container terminal tactical berth and yard incorporate integrative scheduling (MCT-TBY-IIS) for terminals managed by the same organization and located adjacent to one another. Based on computational logistics, MCT-TBY-IIS is decomposed into two moderately coupled subproblems: the multi-terminal dynamic and continuous berth allocation problem (MDC-BAP) and the multi-terminal periodic and rolling yard allocation problem (MPR-YAP). The decomposition uses the multiple knapsack problem and accounts for berth depth constraints and export containers with transferable terminal options. Subsequently, the hierarchical nesting-oriented two-stage improved imperialist competitive algorithm (HNO-TSI-ICA) is customized to optimize MCT-TBY-IIS under the guidance of problem-oriented exploration. With typical examples of multi-terminal joint operations on the southeast coast of China, a combination of two algorithms is selected and applied within HNO-TSI-ICA to solve MCT-TBY-IIS: the prosperity and destruction-oriented improved imperialist competitive algorithm with double assimilation, and the binary imperialist competitive algorithm for the 0-1 knapsack problem. The experiments show that the target cost structure of the storage yard operation subsystem is stable and unaffected by port load or the length of the planning period; notably, the horizontal transportation cost of containers in the export container area contributes most to the sub-target cost of storage yard operations, at a stable proportion of 83%. The modeling and optimization of MCT-TBY-IIS show that the multi-terminal cooperative operation mode has great potential to help neighboring terminals under the same organization reduce costs, increase efficiency, and improve the utilization rate of core resources.
    Available online:  July 03, 2024 , DOI: 10.15888/j.cnki.csa.009558
    Abstract:
    Cross-project defect prediction (CPDP) has emerged as a crucial research area in software engineering and data mining. Using defective code from other data-rich projects to build prediction models solves the problem of insufficient data during model construction. However, the distribution difference between the code files of source and target projects leads to poor cross-project prediction. Most studies adopt domain adaptation to address this problem, but existing methods consider only the influence of the conditional or the marginal distribution on domain adaptation, ignoring its dynamics, and they fail to choose appropriate pseudo-labels. Addressing both aspects, this study proposes a cross-project defect prediction method based on dynamic distribution alignment and pseudo-label learning (DPLD). Specifically, the proposed method reduces the marginal and conditional distribution differences between projects in the domain alignment and category alignment modules, respectively, by means of adversarial domain adaptation, and it dynamically and quantitatively characterizes the relative importance of the two distributions through dynamic distribution factors. Furthermore, a pseudo-label learning method is proposed to improve the accuracy of pseudo-labels as stand-ins for real labels by exploiting the geometric similarity between data. Experiments conducted on the PROMISE dataset show that DPLD achieves average improvements of 22.98% and 15.21% in F-measure and AUC, respectively, demonstrating the effectiveness of DPLD in reducing distribution differences between projects and improving cross-project defect prediction.
    Available online:  June 28, 2024 , DOI: 10.15888/j.cnki.csa.009571
    Abstract:
    Due to the small inter-class differences and large intra-class differences of fine-grained images, the key to fine-grained image classification is finding subtle differences between categories. Recent Vision Transformer-based networks mostly focus on mining the most prominent discriminative region features in an image. This has two problems: first, the network ignores classification clues from other discriminative regions, which easily confuses similar categories; second, the structural relationships of the image are ignored, so category features are extracted inaccurately. To solve these problems, this study proposes two modules: dynamic adaptive modulation and structural relationship learning. The dynamic adaptive modulation module forces the network to search for multiple discriminative regions, and the structural relationship learning module then constructs structural relationships between the discriminative regions. Finally, a graph convolutional network fuses the semantic and structural information to obtain the predicted classification results. The proposed method achieves testing accuracy of 92.9% and 93.0% on the CUB-200-2011 and NA-Birds datasets, respectively, outperforming existing state-of-the-art networks.
    Available online:  June 28, 2024 , DOI: 10.15888/j.cnki.csa.009572
    Abstract:
    The original artificial fish swarm algorithm (AFSA) has weak global search ability and poor robustness and easily falls into local extrema. Given these problems, an adaptive and differential mutation artificial fish swarm algorithm (ADMAFSA) is proposed. First, it uses an adaptive visual field and step length strategy to refine the search of individuals in better regions of the population, improving the optimization accuracy of the algorithm. Second, to explore potentially better regions, an opposition-based learning mechanism is introduced into the random behavior of the fish swarm, strengthening global search ability and avoiding premature convergence. Finally, inspired by the differential evolution algorithm, a mutation operation is applied to poorly performing artificial fish to increase the diversity of the swarm and reduce the chance of falling into local extrema. To validate the performance of the improved algorithm, it is tested on six benchmark functions and eight CEC2019 functions. The experimental results indicate that, compared with other AFSA variants and recent intelligent algorithms, ADMAFSA improves both optimization accuracy and robustness. Furthermore, its optimization effectiveness is further demonstrated on a gear train design problem.
    Available online:  June 28, 2024 , DOI: 10.15888/j.cnki.csa.009573
    Abstract:
    The lattice Boltzmann method (LBM) is a computational fluid dynamics (CFD) method based on molecular kinetic theory. Improving the parallel computing capability of LBM is an important research topic in high-performance computing. Based on the SW26010Pro processor, this article achieves multi-level parallelism of LBM through optimizations such as domain decomposition, data reconstruction, double buffering, and vectorization. With these optimizations, a grid of 56 million cells is tested; the results show that, compared with message passing interface (MPI) parallelism alone, the average speedup of the collision process reaches 61.737 and that of the streaming process reaches 17.3. Strong scaling tests on the lid-driven cavity flow case, with a grid of 1200×1200×1200, show that, scaling up from 62 000 computing cores, the parallel efficiency at one million cores exceeds 60.5%.
    Available online:  June 28, 2024 , DOI: 10.15888/j.cnki.csa.009574
    Abstract:
    As point cloud acquisition technology develops and the demand for 3D applications increases, real-world scenarios require point cloud analysis networks to be updated continuously and dynamically with streaming data. This study proposes a dual feature enhancement method for class-incremental 3D point cloud object learning, which, through incremental learning, adapts point cloud object classification to scenarios where objects of new categories keep emerging in newly acquired data. The method proposes a discriminative local enhancement module and a knowledge injection network to alleviate the new-class bias in class-incremental learning by exploiting the characteristics of point cloud data and old-class information. Specifically, the discriminative local enhancement module characterizes the varied local structures of 3D point cloud objects by perceiving expressive local features; importance weights for each local structure are then obtained from its global information, enhancing the perception of distinctive local features and improving the separation of new- and old-class features. Furthermore, the knowledge injection network injects old knowledge from the old model into the feature learning of the new model, and the enhanced hybrid features effectively mitigate the increased new-class bias caused by missing old-class information. Under incremental learning settings on the 3D point cloud datasets ModelNet40, ScanObjectNN, ScanNet, and ShapeNet, extensive experiments show that, compared with existing state-of-the-art methods, the proposed method improves average incremental accuracy by 2.03%, 2.18%, 1.65%, and 1.28% on the four datasets.
    Available online:  June 28, 2024 , DOI: 10.15888/j.cnki.csa.009575
    Abstract:
    Image segmentation of surface defects on solid oxide fuel cells (SOFC) is of great significance for the quality inspection of monolithic SOFC. Aiming at the blurred edges and complex backgrounds of monolithic SOFC surface defect images, this study proposes a self-attention fusion method for SOFC surface defect image segmentation. First, a multi-channel self-attention module is proposed to enhance inter-channel correlation and improve channel representation. Second, a multi-scale attention fusion module is utilized to further improve the network's ability to extract defect features at different scales. Finally, a triplet joint loss function is proposed to supervise the training process. Experiments show that the proposed method effectively extracts surface defects of monolithic SOFC while improving network segmentation performance.
    Available online:  June 28, 2024 , DOI: 10.15888/j.cnki.csa.009589
    Abstract:
    This study addresses group user authorization management and integrity verification for shared medical data. First, to prevent group users from overstepping their authority, authorization identifiers are introduced: medical data owners use them to allocate different access rights to group users according to user identities, and the mathematical construction of the authorization identifiers effectively ensures that they cannot be forged. Second, to record revoked users and deprive them of access rights, a revoked-user list based on a skip list is introduced. As skip lists support fast lookup and insertion, the overhead of revoking a user is only O(log n). The concrete process and mathematical design of shared data integrity verification are then improved. Finally, security analysis and simulation experiments demonstrate the security and efficiency of the scheme.
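    A minimal sketch of a skip list holding revoked user IDs, matching the O(log n) lookup and insertion the abstract cites; the probabilistic level choice and maximum height are standard textbook parameters, not the paper's.

    import random

    class SkipNode:
        def __init__(self, key, level):
            self.key = key
            self.forward = [None] * level

    class RevokedList:
        MAX_LEVEL, P = 16, 0.5

        def __init__(self):
            self.head = SkipNode(None, self.MAX_LEVEL)
            self.level = 1

        def _random_level(self):
            lvl = 1
            while random.random() < self.P and lvl < self.MAX_LEVEL:
                lvl += 1
            return lvl

        def contains(self, key):  # expected O(log n): is this user revoked?
            node = self.head
            for i in range(self.level - 1, -1, -1):
                while node.forward[i] and node.forward[i].key < key:
                    node = node.forward[i]
            node = node.forward[0]
            return node is not None and node.key == key

        def revoke(self, key):  # expected O(log n) insertion
            update, node = [self.head] * self.MAX_LEVEL, self.head
            for i in range(self.level - 1, -1, -1):
                while node.forward[i] and node.forward[i].key < key:
                    node = node.forward[i]
                update[i] = node
            lvl = self._random_level()
            self.level = max(self.level, lvl)
            new = SkipNode(key, lvl)
            for i in range(lvl):
                new.forward[i] = update[i].forward[i]
                update[i].forward[i] = new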
    Available online:  June 28, 2024 , DOI: 10.15888/j.cnki.csa.009582
    Abstract:
    Clustering algorithms based on the minimum spanning tree (MST) can identify clusters of arbitrary shape, but they are limited in efficiently constructing the MST and identifying invalid edges, and they are easily influenced by noise points. This study proposes an MST clustering algorithm based on local density peaks and label propagation (DPMST), combining the strength of the density peaks clustering algorithm in finding local density peaks and excluding noise points with the MST algorithm. DPMST adopts a shared-neighbor-based distance between local density peaks and uses the neighborhood information between local density peaks to efficiently construct the MST and identify invalid edges, enabling the discovery of clusters with complex structures. Label propagation is used to strengthen strong labels and weaken weak labels to refine wrong labels, which improves the quality of clustering results, especially for border-region points, and helps reveal complex manifolds. Experimental results on several synthetic and real-world datasets show that DPMST outperforms the classical clustering algorithms DPC, MST, K-means, DBSCAN, AP, SC, and BIRCH.
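    A minimal SciPy sketch of the generic MST clustering step underlying such algorithms: build the MST over pairwise distances, then cut the k-1 longest edges to obtain k clusters. DPMST's density peaks, shared-neighbor distance, and label propagation refinements are not reproduced here.

    import numpy as np
    from scipy.sparse.csgraph import connected_components, minimum_spanning_tree
    from scipy.spatial.distance import pdist, squareform

    def mst_clusters(points, k):
        dist = squareform(pdist(points))           # dense pairwise distances
        mst = minimum_spanning_tree(dist).tocoo()  # the n-1 edges of the MST
        keep = np.ones(len(mst.data), dtype=bool)
        keep[np.argsort(mst.data)[::-1][:k - 1]] = False  # cut the k-1 longest edges
        graph = np.zeros_like(dist)
        graph[mst.row[keep], mst.col[keep]] = mst.data[keep]
        _, labels = connected_components(graph, directed=False)
        return labels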
    Available online:  June 28, 2024 , DOI: 10.15888/j.cnki.csa.009593
    Abstract:
    Semantic segmentation of remote sensing images plays a crucial role in environmental monitoring, land cover classification, and urban planning. Convolutional neural networks and their improved variants are the mainstream methods for semantic segmentation of remote sensing images. However, these methods focus on learning local contextual features and cannot effectively model the global distribution relationships among different objects, which restricts the segmentation performance of the model. To address this issue, this study builds a global semantic relationship learning module on top of convolutional neural networks, which fully learns the co-occurrence relationships between different objects and effectively enhances the model's representation ability. In addition, given the scale differences of the objects to be segmented in the same scene, a multi-scale learning module is constructed to integrate global semantic relationships at different scales. To evaluate the performance of the model, extensive experiments are conducted on two commonly used remote sensing image datasets, Vaihingen and Potsdam. The experimental results show that the proposed method achieves higher segmentation performance than existing models based on convolutional neural networks.
    Available online:  June 28, 2024 , DOI: 10.15888/j.cnki.csa.009594
    Abstract:
    In solid tumor oncology, DNA amplification often appears as diffraction-limited blobs on fluorescence microscopy images of interphase nuclei processed with fluorescence in situ hybridization (FISH). Imaging conditions limit image quality, resulting in low signal-to-noise ratios, serious background interference, and interference from non-blob structures. Designing suitable blob detection methods to provide objective, quantitative data helps doctors diagnose cancer. The algorithm first denoises the fluorescence image with three-layer wavelet multiscale summation, then enhances blob areas with a multiscale Laplacian of Gaussian operator, and finally suppresses non-blob areas with unilateral second-order Gaussian kernels in four directions to complete blob detection. Experimental results show that on 83 images in a self-built database, the average F-score reaches 0.96 and the average running time is under 0.5 s.
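    A minimal SciPy sketch of the multiscale Laplacian-of-Gaussian enhancement step; the sigma values and peak threshold are assumptions, and the wavelet denoising and directional suppression stages are omitted.

    import numpy as np
    from scipy.ndimage import gaussian_laplace, maximum_filter

    def log_blobs(image, sigmas=(1.0, 2.0, 3.0), thresh=0.05):
        # scale-normalized LoG; bright blobs give strong minima, so negate
        stack = np.stack([-s ** 2 * gaussian_laplace(image.astype(float), s)
                          for s in sigmas])
        response = stack.max(axis=0)  # strongest response over scales
        peaks = (response == maximum_filter(response, size=5)) & (response > thresh)
        return np.argwhere(peaks)     # (row, col) blob centers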
    Available online:  June 28, 2024 , DOI: 10.15888/j.cnki.csa.009596
    Abstract:
    Effective segmentation of clouds and their shadows is a critical issue in remote sensing image processing, playing a significant role in surface feature extraction, climate detection, and atmospheric correction. However, clouds and cloud shadows in remote sensing images have diverse, irregular distributions and fuzzy boundary information that is easily disturbed by the background, making accurate feature extraction challenging; moreover, few networks are designed specifically for this task. To address these issues, this study proposes a dual-path network combining the vision Transformer (ViT) and D-UNet. The network is divided into two branches: one is a convolutional local feature extraction module based on the dilated convolution module of D-UNet, which introduces multi-scale atrous spatial pyramid pooling (ASPP) to extract multi-dimensional features; the other comprehends the global context semantics through the vision Transformer, enhancing feature extraction. Finally, upsampling is performed through a feature fusion decoder. The model achieves superior performance on both a self-built dataset of clouds and cloud shadows and the publicly available HRC_WHU dataset, reaching MIoU of 92.05% and 85.37%, respectively, 0.52 and 0.44 percentage points ahead of the second-best model.
    Available online:  June 28, 2024 , DOI: 10.15888/j.cnki.csa.009602
    Abstract:
    In 3D human pose estimation, the complex topology formed by the connections between human joints presents a challenge. Modeling this structure with a graph convolutional network effectively captures the connections between local joints. Although non-adjacent joints lack direct physical connections, Transformer encoders establish contextual relationships between joints, which is crucial for better pose inference given the biomechanical constraints on human motion and pose and the synergistic interaction of human joints. Balancing model performance against the number of parameters is particularly important for large-scale models. To tackle these challenges, a multi-layer spatial feature fusion network model (MLSFFN) based on graph convolution and Transformers is designed, which proficiently fuses local and global spatial features with a relatively small parameter set. Experimental results demonstrate that the proposed method achieves a mean per-joint position error (MPJPE) of 49.9 mm on the Human3.6M dataset with only 2.1M parameters. Moreover, the model demonstrates robust generalization capability.
    Available online:  June 28, 2024 , DOI: 10.15888/j.cnki.csa.009621
    Abstract:
    Gansu painted pottery has the most complete spatial and temporal sequence among China's painted pottery cultures. However, no study has specifically addressed style transfer for Gansu painted pottery. To promote excellent traditional Chinese culture, this research constructs the first Gansu painted pottery dataset and proposes a new geometric style transfer method. The method generates a neural distortion field that deforms Gansu painted pottery into the geometric style of a target object while maintaining the texture of the pottery. Two modules, position embedding and feature enhancement, are incorporated into the network structure to improve the quality of feature encoding. A shape consistency loss and a smoothness regularization term are introduced into the loss function to prevent the details of the painted pottery from mutating and to improve the deformation effect. The experimental results show that the model achieves large-scale geometric style transfer between Gansu painted pottery and objects of different classes while maintaining the details of the pottery, providing new visual experiences.
    Available online:  June 28, 2024 , DOI: 10.15888/j.cnki.csa.009627
    Abstract:
    This study proposes a deep learning model for short-term precipitation forecasting, called MSF-Net, to address the limitations of traditional methods. This model integrates multi-source data, including GPM historical precipitation data, ERA5 meteorological data, radar data, and DEM data. A meteorological feature extraction module is employed to learn the meteorological features of the multi-source data. An attention fusion prediction module is used to achieve feature fusion and short-term precipitation forecasting. The precipitation forecasting results of MSF-Net are compared with those of various artificial intelligence methods. Experimental results indicate that MSF-Net achieves optimal threat score (TS) and bias score (Bias). This suggests that it can enhance the effectiveness of data-driven precipitation forecasting within a 6-hour prediction horizon.
    Available online:  June 28, 2024 , DOI: 10.15888/j.cnki.csa.009595
    Abstract:
    The task of camouflaged object detection involves locating and identifying camouflaged objects in complex scenes. While deep neural network-based methods have been applied to this task, many of them struggle to fully utilize multi-level features of the target for extracting rich semantic information in complex scenes with interference, often relying solely on fixed-size features to identify camouflaged objects. To address this challenge, this study proposes a camouflaged object detection network based on multi-scale and neighbor-level feature fusion. This network comprises two innovative designs: the multi-scale feature perception module and the two-stage neighbor-level interaction module. The former aims to capture rich local-global contrast information in complex scenes by combining multi-scale features. The latter integrates features from adjacent layers to exploit cross-layer correlations and transfer valuable contextual information from the encoder to the decoder network. The proposed method has been evaluated on three public datasets: CHAMELEON, CAMO-Test, and COD10K-Test, and compared with the current mainstream methods. The experimental results demonstrate that the proposed method outperforms the current mainstream methods, achieving excellent performance across all metrics.
    Available online:  June 28, 2024 , DOI: 10.15888/j.cnki.csa.009521
    Abstract:
    Synthetic aperture radar (SAR) and optical image fusion aims to leverage the complementary imaging of satellite sensors to generate more comprehensive geomorphological information. However, because of the heterogeneous data distributions of individual satellite sensors and the differences in their imaging physics, existing network models often exhibit low imaging accuracy during fusion. This study proposes DNAP-Fusion, a novel SAR and optical image fusion network that incorporates dual non-local attention perception. The proposed method uses a dual non-local perceptual attention module to extract structural information from SAR images and texture details from optical images within a multi-level image pyramid of gradually decreasing spatial scale, and then fuses their complementary features in both spatial and channel dimensions. The fused features are injected into the upsampled optical image through image reconstruction, producing the final fusion result. Additionally, before network training, image encapsulation decisions are employed to enhance the commonality between objects in SAR and optical images of the same scene. Qualitative and quantitative experimental results demonstrate that the proposed method outperforms state-of-the-art (SOTA) multisensor fusion methods, with a correlation coefficient (CC) of 0.9906 and a peak signal-to-noise ratio (PSNR) of 32.1560 dB in the objective evaluation indices. Moreover, the method effectively fuses the complementary features of SAR and optical images, offering a valuable approach for enhancing the accuracy and effectiveness of remote sensing image fusion.
    Available online:  May 31, 2024 , DOI: 10.15888/j.cnki.csa.009562
    Abstract:
    The malicious use of facial recognition technology may lead to personal information leakage, posing a significant threat to individual privacy, so safeguarding facial privacy through universal adversarial attacks holds crucial research significance. However, existing universal adversarial attack algorithms primarily target image classification; when applied to facial recognition models, they often suffer from low attack success rates and noticeable perturbations. To address these challenges, this study proposes a universal adversarial attack method for face recognition based on commonality gradients. The method optimizes a universal adversarial perturbation from the common gradient of the adversarial perturbations of multiple face images and uses a dominant feature loss to improve the attack capability of the perturbation; combined with a multi-stage training strategy, it balances attack effect and visual quality. Experiments on public datasets show that the method outperforms Cos-UAP, SGA, and similar methods in attack performance on facial recognition models, and the generated adversarial samples have better visual quality, indicating the effectiveness of the proposed method.
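    A minimal PyTorch sketch of updating a universal perturbation from the gradient shared by a batch of face images; a plain cosine-similarity loss stands in for the paper's dominant feature loss, and the multi-stage training strategy is not reproduced.

    import torch
    import torch.nn.functional as F

    def update_uap(model, images, uap, step=0.01, eps=8 / 255):
        # uap: (1, C, H, W) universal perturbation broadcast across the batch
        uap = uap.clone().requires_grad_(True)
        adv = model(images + uap)      # embeddings of perturbed faces
        clean = model(images).detach()
        loss = F.cosine_similarity(adv, clean).mean()
        loss.backward()                # uap.grad aggregates the whole batch: a common gradient
        with torch.no_grad():
            new = uap - step * uap.grad.sign()  # reduce similarity to the clean identity
            return new.clamp(-eps, eps).detach()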
  • Abstract Click Ranking (overall / annual / per-issue)

    2000,9(2):38-41, DOI:
    [Abstract] (12591) [HTML] (0) [PDF ] (20733)
    Abstract:
    This paper discusses in detail how to combine VRML with other data access technologies to achieve real-time interaction with databases, and briefly describes the syntax and technical requirements of the relevant specifications. The techniques used are safe and reliable, perform well in practical applications, and facilitate system portability.
    1993,2(8):41-42, DOI:
    [Abstract] (9669) [HTML] (0) [PDF ] (30433)
    Abstract:
    This paper presents the author's experience in recent years of using the utility software NU to remove viruses from floppy disk boot sectors and hard disk master boot records and to repair disks with damaged boot sectors; practice has shown the approach to be simple and effective.
    1995,4(5):2-5, DOI:
    [Abstract] (9176) [HTML] (0) [PDF ] (12677)
    Abstract:
    This paper briefly introduces the definition, overview, and significance of the customs EDI automated clearance system, and analyzes, in light of practice, the legal issues involved in the business operation mode under this EDI application system, the adoption of the EDIFACT international standard, network and software technology issues, and project management issues.
    2016,25(8):1-7, DOI: 10.15888/j.cnki.csa.005283
    [Abstract] (8633) [HTML] () [PDF 1167952] (36686)
    Abstract:
    Since 2006, deep neural networks have achieved great success in big data processing and artificial intelligence, including image/speech recognition and autonomous driving, and unsupervised learning, as a pre-training method for deep neural networks, has played a very important role in that success. This paper introduces and analyzes unsupervised learning methods in deep learning, summarizing the two commonly used classes: deterministic autoencoder methods and probabilistic methods based on restricted Boltzmann machines trained with contrastive divergence. It also introduces the applications of these two classes of methods in deep learning systems, and finally summarizes the problems and challenges facing unsupervised learning and offers an outlook.
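    As a rough illustration of the deterministic autoencoder pre-training surveyed above, the PyTorch sketch below trains a one-hidden-layer autoencoder on unlabeled data; layer sizes and optimizer settings are generic textbook choices, not taken from the surveyed systems.

    import torch
    import torch.nn as nn

    class AutoEncoder(nn.Module):
        def __init__(self, dim_in=784, dim_hidden=128):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(dim_in, dim_hidden), nn.ReLU())
            self.decoder = nn.Linear(dim_hidden, dim_in)

        def forward(self, x):
            return self.decoder(self.encoder(x))

    def pretrain(model, data_loader, epochs=5):
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        for _ in range(epochs):
            for x, _ in data_loader:                        # labels unused: unsupervised
                x = x.view(x.size(0), -1)
                opt.zero_grad()
                loss = nn.functional.mse_loss(model(x), x)  # reconstruction error
                loss.backward()
                opt.step()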
    2008,17(5):122-126, DOI:
    [Abstract] (7702) [HTML] (0) [PDF ] (46655)
    Abstract:
    With the rapid development of the Internet, network resources are increasingly abundant, and extracting information from the Web has become crucial; in particular, Deep Web information retrieval, which accounts for some 80% of web resources, is a difficult problem deserving close attention. To advance research on Deep Web crawler technology, this paper gives a comprehensive, detailed introduction to Deep Web crawlers: it first describes their definition and research goals, then reviews and analyzes recent research progress at home and abroad, and on this basis looks ahead to research trends, laying a foundation for further work.
    2011,20(11):80-85, DOI:
    [Abstract] (7573) [HTML] () [PDF 863160] (41030)
    Abstract:
    After studying current mainstream video transcoding schemes, this paper proposes a distributed transcoding system. The system uses HDFS (Hadoop Distributed File System) for video storage and performs distributed transcoding with the MapReduce paradigm and FFMPEG. The segmentation strategy for distributed video storage and the influence of segment size on access time are discussed in detail, and metadata formats for video storage and conversion are defined. A distributed transcoding scheme based on the MapReduce programming framework is proposed, in which the Mapper side transcodes and the Reducer side merges the video. Experimental data show how transcoding time varies with segment size and the number of transcoding machines.
    1999,8(7):43-46, DOI:
    [Abstract] (7203) [HTML] (0) [PDF ] (22366)
    Abstract:
    Representing a large color space with fewer colors has long been a research topic. This paper discusses halftoning and dithering techniques in detail, extends the discussion to the practical true-color space, and gives algorithms for their implementation.
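    A minimal sketch of ordered dithering with a 4×4 Bayer matrix, one of the classic dithering techniques the paper discusses, shown here for a single 8-bit channel rather than the paper's true-color extension:

    import numpy as np

    BAYER4 = np.array([[ 0,  8,  2, 10],
                       [12,  4, 14,  6],
                       [ 3, 11,  1,  9],
                       [15,  7, 13,  5]]) / 16.0

    def ordered_dither(channel):
        # channel: 2-D uint8 array; returns a binary (0/255) halftone image
        h, w = channel.shape
        threshold = BAYER4[np.arange(h)[:, None] % 4, np.arange(w)[None, :] % 4]
        return ((channel / 255.0 > threshold) * 255).astype(np.uint8)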
    2007,16(9):22-25, DOI:
    [Abstract] (6444) [HTML] (0) [PDF ] (5282)
    Abstract:
    Based on the actual security state of a legacy logistics system, this paper analyzes the shortcomings of object-oriented programming in handling crosscutting concerns versus core concerns, points out the advantages of aspect-oriented programming in separating a system's concerns, analyzes AspectJ as a concrete implementation of aspect-oriented programming, and proposes a method for the IC card security evolution of the legacy logistics system based on AspectJ.
    2012,21(3):260-264, DOI:
    [Abstract] (6410) [HTML] () [PDF 336300] (43530)
    Abstract:
    The core issues of an open platform are user authentication and authorization. OAuth is the currently prevailing international authorization scheme; its distinguishing feature is that a third-party application can request access to a user's protected resources without the user entering a username and password into that application. The latest version, OAuth 2.0, has a simpler and more secure authentication and authorization flow. This paper studies how OAuth 2.0 works, analyzes the workflow for refreshing access tokens, and presents a server-side design together with a concrete application example.
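    A minimal sketch of the refresh-token request analyzed in the paper, following the OAuth 2.0 specification (RFC 6749, section 6); the endpoint URL and client credentials are placeholders.

    import requests

    def refresh_access_token(token_url, client_id, client_secret, refresh_token):
        resp = requests.post(token_url, data={
            "grant_type": "refresh_token",   # fixed by the OAuth 2.0 spec
            "refresh_token": refresh_token,
            "client_id": client_id,
            "client_secret": client_secret,
        })
        resp.raise_for_status()
        payload = resp.json()                # new access_token, expires_in, ...
        return payload["access_token"], payload.get("refresh_token", refresh_token)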
    2011,20(7):184-187,120, DOI:
    [Abstract] (6253) [HTML] () [PDF 731903] (31756)
    Abstract:
    To meet the practical requirements of smart homes and environmental monitoring, a long-range wireless sensor node is designed. The system uses the second-generation system-on-chip CC2530, which integrates an RF transceiver and a controller, as the core module, with an external CC2591 RF front-end power amplifier. The software is based on the ZigBee 2006 protocol stack, implementing the application-layer functions on top of the ZStack general modules. The construction of a wireless data acquisition network based on the ZigBee protocol is introduced, and hardware schematics and software flowcharts for the sensor and coordinator nodes are given. Experiments show that the nodes perform well and communicate reliably, with a communication range noticeably greater than TI's first-generation products.
    2004,13(10):7-9, DOI:
    [Abstract] (5962) [HTML] (0) [PDF ] (10250)
    Abstract:
    This paper introduces the composition of a vehicle monitoring system and studies the hardware and software design of the mobile unit using the Rockwell GPS OEM board and the WISMO QUIK Q2406B module, as well as the design of the GIS software at the monitoring center. It focuses on how the Q2406B module, with embedded TCP/IP processing, accesses the Internet via AT commands and exchanges TCP data with the monitoring center.
    2008,17(1):113-116, DOI:
    [Abstract] (5877) [HTML] (0) [PDF ] (48209)
    Abstract:
    Sorting is an important operation in computer programming. This paper discusses an improvement of the quicksort algorithm in C, namely an implementation combining quicksort with direct insertion sort. When implementing large-scale internal sorting in C programs, the goal is a simple, effective, and fast algorithm. The paper elaborates the process of improving quicksort, from its basic performance characteristics to successive algorithmic refinements, arriving at the best improved algorithm through repeated analysis and experiment.
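    A minimal sketch of the quicksort + direct insertion sort hybrid the paper describes, in Python rather than the paper's C; the cutoff of 16 is a common choice, not taken from the paper.

    def insertion_sort(a, lo, hi):
        for i in range(lo + 1, hi + 1):
            key, j = a[i], i - 1
            while j >= lo and a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = key

    def hybrid_quicksort(a, lo=0, hi=None, cutoff=16):
        if hi is None:
            hi = len(a) - 1
        while lo < hi:
            if hi - lo + 1 <= cutoff:
                insertion_sort(a, lo, hi)  # small ranges: insertion sort is faster
                return
            pivot = a[(lo + hi) // 2]
            i, j = lo, hi
            while i <= j:                  # Hoare-style partition
                while a[i] < pivot:
                    i += 1
                while a[j] > pivot:
                    j -= 1
                if i <= j:
                    a[i], a[j] = a[j], a[i]
                    i += 1
                    j -= 1
            if j - lo < hi - i:            # recurse on the smaller side first
                hybrid_quicksort(a, lo, j, cutoff)
                lo = i
            else:
                hybrid_quicksort(a, i, hi, cutoff)
                hi = j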
    2008,17(8):87-89, DOI:
    [Abstract] (5815) [HTML] (0) [PDF ] (40304)
    Abstract:
    With the wide application of object-oriented software development and the demand for software test automation, model-based software testing has gradually been recognized and accepted by software developers and testers. Model-based testing is one of the main testing methods in the coding phase, offering high testing efficiency and good results in exposing faults with complex logic, although false positives, missed faults, and failure mechanisms require further study. This paper analyzes and classifies the main test models, gives a preliminary analysis of parameters such as fault density, and finally proposes a model-based software testing process.
    2019,28(6):1-12, DOI: 10.15888/j.cnki.csa.006915
    [Abstract] (5738) [HTML] (17449) [PDF 672566] (17137)
    Abstract:
    A knowledge graph is a knowledge base that represents, in graph form, the concepts and entities of the objective world and the relationships between them; it is one of the fundamental technologies behind intelligent services such as semantic search, intelligent question answering, and decision support. At present, the notion of a knowledge graph is still not clearly delineated, and because of incomplete documentation, existing knowledge graphs see low usage and reuse. This paper therefore defines the knowledge graph and clarifies its relationship to related concepts such as the ontology: the ontology is the schema layer and logical foundation of a knowledge graph, while a knowledge graph is an instantiation of an ontology, so ontology research results can serve as a foundation for knowledge graph research and promote its faster development and wider application. The paper surveys and analyzes the major general-purpose and domain-specific knowledge graphs at home and abroad, together with their construction, storage, and retrieval methods, so as to improve their usage and reuse, and finally points out future research directions for knowledge graphs.
    2008,17(8):2-5, DOI:
    [Abstract] (5722) [HTML] (0) [PDF ] (31088)
    Abstract:
    This paper introduces the design and implementation of a single sign-on system for an enterprise information portal. The system implements single sign-on based on the Java EE architecture, combining credential encryption with Web Services to provide unified authentication and access control for portal users. The paper elaborates the system's overall structure, design ideas, working principles, and concrete implementation; the system has already been applied successfully in broadcasting-industry information portal platforms in several provinces and cities.
    2004,13(8):58-59, DOI:
    [Abstract] (5643) [HTML] (0) [PDF ] (26747)
    Abstract:
    This paper introduces several methods in Visual C++ 6.0 for moving focus between multiple text boxes in a dialog with the Enter key, and proposes an improved method.
    2009,18(5):182-185, DOI:
    [Abstract] (5597) [HTML] (0) [PDF ] (32166)
    Abstract:
    DICOM is the international standard for the storage and transmission of medical images, and DCMTK is a free, open-source toolkit for the DICOM standard. Parsing the DICOM file format and solving the display of DICOM medical images are fundamental to medical image processing and of great significance to research on medical imaging technology. This paper interprets the DICOM file format, introduces the principle of window adjustment (windowing), and implements medical image display and windowing using VC++ and DCMTK.
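    A minimal NumPy sketch of the window center/width (windowing) transform the paper implements with DCMTK; the example CT window values are illustrative, not taken from the paper.

    import numpy as np

    def apply_window(pixels, center, width):
        # map [center - width/2, center + width/2] linearly onto 0..255
        lo = center - width / 2.0
        out = (pixels.astype(float) - lo) / width * 255.0
        return np.clip(out, 0, 255).astype(np.uint8)

    # e.g. a typical soft-tissue CT window: apply_window(hu_image, center=40, width=400)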
    2009,18(3):164-167, DOI:
    [Abstract] (5582) [HTML] (0) [PDF ] (27751)
    Abstract:
    This paper introduces a simple method, based on DWGDirectX and independent of the AutoCAD platform, for displaying and manipulating DWG files and adding simple entities, and analyzes and implements the method.
  • Full-Text Download Ranking (overall / annual / per-issue)

    2007,16(10):48-51, DOI:
    [Abstract] (4752) [HTML] (0) [PDF 0.00 Byte] (86837)
    Abstract:
    This paper studies the HDF data format and library. Taking raster images as the main example, it discusses in detail how to read and process raster data with VC++.NET and VC#.NET and then display the image from the resulting pixel matrix by point plotting. The work was carried out against the background of the National Meteorological Center's development of MICAPS 3.0, a meteorological information comprehensive analysis and processing system.
    2002,11(12):67-68, DOI:
    [Abstract] (3936) [HTML] (0) [PDF 0.00 Byte] (58089)
    Abstract:
    This paper introduces a method for developing real-time data acquisition with Visual C++ 6.0 under Windows 2000, a non-real-time operating system, using the Advantech PCL-818L data acquisition card. Drawing on the API functions in the PCL-818L DLLs, three methods for high-speed real-time data acquisition are presented, together with their advantages and disadvantages.
