Abstract: Significant radiometric and geometric discrepancies among multimodal remote sensing images pose substantial challenges for high-precision registration. To address these issues, this study proposes a phase-congruency-enhanced adjacent self-similarity matching method, termed PC-ASS, for multimodal remote sensing imagery. First, a multi-scale image representation is constructed via nonlinear diffusion filtering to suppress noise while preserving shared edge and structural information, providing a reliable foundation for subsequent feature detection. Next, phase congruency amplitude maps are computed with multi-scale, multi-orientation Log-Gabor filters to characterize structurally salient regions of the images. The phase congruency amplitudes are then used as weighting factors when computing adjacent self-similarity responses, enhancing structural features: regions with higher phase congruency yield stronger responses, increasing both the number and quality of robust features such as edges and corners. Furthermore, during descriptor construction, a phase-congruency weighting mechanism is incorporated into the polar statistical histogram framework, weighting each pixel's adjacent self-similarity value by its phase congruency amplitude. This ensures that structurally salient regions contribute more prominently to the descriptor, thereby improving robustness against noise, texture interference, and cross-modal radiometric differences. Finally, incorrect matches are eliminated through a nearest-neighbor distance ratio strategy combined with the fast sample consensus (FSC) algorithm, enabling high-precision registration. Comparative experiments on three publicly available multimodal remote sensing datasets against five representative methods (PSO-SIFT, OSS, HAPCG, RIFT, and ASS) demonstrate that PC-ASS outperforms existing approaches in average number of correct matches, mean root-mean-square error, and correct matching rate, highlighting its robustness and broad applicability.
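
To make the two core ideas summarized above concrete, the following minimal Python sketch illustrates (a) a simplified phase congruency amplitude obtained from multi-scale, multi-orientation Log-Gabor filtering and (b) an adjacent self-similarity response weighted by that amplitude. All function names, parameters (number of scales and orientations, patch radius, shift offset), and the simplified PC formula are illustrative assumptions, not the authors' implementation; the nonlinear diffusion pyramid, polar statistical histogram descriptor, and NNDR + FSC outlier removal described in the abstract are omitted.

```python
# Hedged sketch of phase-congruency-weighted adjacent self-similarity.
# Names and parameters (log_gabor_pc, pc_weighted_ass, n_scales, n_orients,
# radius, offset) are illustrative assumptions, not the paper's code.
import numpy as np
from scipy.signal import fftconvolve

def log_gabor_pc(img, n_scales=4, n_orients=6, min_wavelength=3, mult=2.1,
                 sigma_onf=0.55, eps=1e-6):
    """Approximate phase congruency amplitude from multi-scale,
    multi-orientation Log-Gabor responses (simplified: PC = |sum of complex
    responses| / sum of response amplitudes, no noise compensation)."""
    rows, cols = img.shape
    F = np.fft.fft2(img)
    # Normalized frequency grid.
    U, V = np.meshgrid(np.fft.fftfreq(cols), np.fft.fftfreq(rows))
    radius = np.sqrt(U**2 + V**2)
    radius[0, 0] = 1.0                     # avoid log(0) at the DC term
    theta = np.arctan2(-V, U)
    sum_amp = np.zeros((rows, cols))
    sum_resp = np.zeros((rows, cols), dtype=complex)
    for o in range(n_orients):
        angle = o * np.pi / n_orients
        # Angular spread around this filter orientation.
        d_theta = np.abs(np.angle(np.exp(1j * (theta - angle))))
        spread = np.exp(-(d_theta**2) / (2 * (np.pi / n_orients)**2))
        for s in range(n_scales):
            fo = 1.0 / (min_wavelength * mult**s)   # center frequency
            log_gabor = np.exp(-(np.log(radius / fo)**2) /
                               (2 * np.log(sigma_onf)**2))
            log_gabor[0, 0] = 0.0          # zero DC component
            eo = np.fft.ifft2(F * log_gabor * spread)
            sum_amp += np.abs(eo)
            sum_resp += eo
    # Phase congruency amplitude in [0, 1]: large where responses align in phase.
    return np.abs(sum_resp) / (sum_amp + eps)

def pc_weighted_ass(img, pc, radius=5, offset=3):
    """Adjacent self-similarity response weighted by phase congruency:
    compare each patch with an adjacent (shifted) copy, then weight by PC."""
    shifted = np.roll(img, shift=(offset, offset), axis=(0, 1))
    diff2 = (img - shifted)**2
    # Box-filter the squared differences to get a patch-wise dissimilarity.
    k = 2 * radius + 1
    ssd = fftconvolve(diff2, np.ones((k, k)) / (k * k), mode="same")
    similarity = np.exp(-ssd / (ssd.mean() + 1e-6))
    return pc * similarity                 # salient structures respond more strongly

if __name__ == "__main__":
    img = np.random.rand(128, 128)         # stand-in for one remote sensing band
    pc = log_gabor_pc(img)
    response = pc_weighted_ass(img, pc)
    print(response.shape, float(response.max()))
```

In this sketch the weighting is a plain elementwise product, so pixels with near-zero phase congruency contribute little to the response map, which mirrors the abstract's intent that structurally salient regions dominate both feature detection and descriptor construction.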