To address the limitations of existing random convolutional kernel transform (ROCKET) methods in feature extraction depth and nonlinear modeling capability, a method termed ML-ROCKET is proposed. The proposed method extends ROCKET into a multi-layer structure that extracts nonlinear features layer by layer, enriching the feature representation and improving classification accuracy. A two-dimensional convolution structure combined with sequential pooling operations is adopted to better model multivariate interactions and intrinsic temporal patterns. In addition, a sequential feature detachment (SFD) pruning strategy is introduced to improve both performance and inference speed. Experimental results demonstrate that the proposed method achieves classification accuracy comparable to state-of-the-art models on 109 univariate UCR datasets and 26 multivariate UEA datasets, while significantly outperforming most existing methods in training efficiency.
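For orientation, the idea of stacking random-kernel transforms can be sketched as follows. This is a minimal illustration under stated assumptions, not the ML-ROCKET implementation: the kernel sampling follows the commonly described ROCKET recipe (kernel lengths 7, 9, or 11, mean-centered Gaussian weights, uniform bias, max and proportion-of-positive-values pooling) with dilation and padding omitted, and the two-layer stacking with a ReLU between layers is a hypothetical stand-in for the layer-wise nonlinear extraction described above. The function names `sample_kernels`, `activation_map`, `pooled_features`, and `two_layer_features` are illustrative only. The two-dimensional convolution and the SFD pruning step are not shown.

```python
import numpy as np


def sample_kernels(num_kernels, rng):
    """Sample ROCKET-style random kernels as (weights, bias) pairs.

    Simplification: dilation and padding are omitted here.
    """
    kernels = []
    for _ in range(num_kernels):
        length = int(rng.choice([7, 9, 11]))
        weights = rng.normal(size=length)
        weights -= weights.mean()  # mean-center the weights
        bias = rng.uniform(-1.0, 1.0)
        kernels.append((weights, bias))
    return kernels


def activation_map(x, weights, bias):
    """Slide one kernel over a 1-D series (cross-correlation, no padding)."""
    return np.correlate(x, weights, mode="valid") + bias


def pooled_features(x, kernels):
    """Pool each activation map into two features: max and PPV."""
    feats = []
    for weights, bias in kernels:
        out = activation_map(x, weights, bias)
        feats.append(out.max())           # global max pooling
        feats.append((out > 0).mean())    # proportion of positive values
    return np.asarray(feats)


def two_layer_features(x, layer1, layer2):
    """Hypothetical two-layer stacking (an assumption, not the paper's design).

    Each first-layer activation map is passed through a ReLU and then
    transformed again by the second-layer kernels before pooling.
    """
    feats = []
    for weights, bias in layer1:
        hidden = np.maximum(activation_map(x, weights, bias), 0.0)
        feats.append(pooled_features(hidden, layer2))
    return np.concatenate(feats)


rng = np.random.default_rng(0)
series = rng.normal(size=100)          # synthetic univariate series
layer1 = sample_kernels(4, rng)
layer2 = sample_kernels(3, rng)

single = pooled_features(series, layer1)             # 2 features per kernel
stacked = two_layer_features(series, layer1, layer2)  # 4 * 3 * 2 features
```

In this sketch the pooled features would be passed to a linear classifier, as is usual for ROCKET-style transforms; the number of stacked features grows with the product of the kernel counts per layer, which is one reason a pruning step becomes relevant.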