Abstract: Enhancing the mathematical reasoning ability of large language models is a central focus of current research. Among existing approaches, chain-of-thought techniques substantially improve model performance through prompt optimization; however, most of these methods rely heavily on distillation from extremely large models, leaving the intrinsic potential of the models themselves largely untapped. To address this limitation, this study proposes MCTS-EF, a mathematical reasoning enhancement framework for language models based on Monte Carlo tree search (MCTS) and self-consistency. The framework leverages the model’s own verification-feedback capability in conjunction with MCTS to achieve dynamic error correction, and employs self-consistency to mitigate hallucinations and further improve reasoning performance, without relying on distillation, fine-tuning, or external reward models. Through the cooperation of self-feedback, verification feedback, and in-context learning, the framework couples MCTS with an evaluation-feedback loop to optimize reasoning trajectories, thereby fully activating the model’s inherent mathematical reasoning potential. Experimental results show that the accuracy of Qwen2-7B on the MATH-500 dataset rises from 44% to 68%, surpassing the Qwen2-72B model, and significant improvements are also observed across other model variants. Furthermore, the behavior of related models under different conditions within the proposed framework is systematically analyzed, providing methodological insights and technical directions for future research.
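The abstract does not specify implementation details, but the interplay it describes between MCTS, verification feedback, and self-consistency can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's actual algorithm: `propose_steps` and `self_verify` are hypothetical stand-ins for model calls (candidate next reasoning steps and a self-verification score), here stubbed with a toy problem so the control flow is runnable.

```python
import math
import random
from collections import Counter

# Hypothetical stand-ins for LLM calls. In the real framework, propose_steps
# would sample candidate reasoning steps and self_verify would score a path
# via the model's own verification feedback; here they are toy stubs.
def propose_steps(path):
    return ["a", "b", "c"]          # candidate next "reasoning steps"

def self_verify(path):
    # Toy verifier reward in [0, 1]: paths with more "a" steps score higher.
    return path.count("a") / max(len(path), 1)

class Node:
    def __init__(self, path, parent=None):
        self.path = path            # reasoning steps taken so far
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0            # cumulative verifier reward

    def uct(self, c=1.4):
        # Standard UCT: exploit mean reward, explore rarely visited children.
        if self.visits == 0:
            return float("inf")
        return (self.value / self.visits
                + c * math.sqrt(math.log(self.parent.visits) / self.visits))

def mcts(iterations=200, max_depth=4, seed=0):
    random.seed(seed)
    root = Node([])
    for _ in range(iterations):
        # Selection: descend by UCT until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=Node.uct)
        # Expansion: add candidate next steps unless the path is complete.
        if len(node.path) < max_depth:
            for step in propose_steps(node.path):
                node.children.append(Node(node.path + [step], node))
            node = random.choice(node.children)
        # Simulation: roll out to a full path, then score it via self-verification.
        path = list(node.path)
        while len(path) < max_depth:
            path.append(random.choice(propose_steps(path)))
        reward = self_verify(path)
        # Backpropagation: the verifier reward steers later selection,
        # which is what lets the search correct earlier bad steps.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    return root

def self_consistent_answer(sampled_answers):
    # Self-consistency: majority vote over independently sampled final answers.
    return Counter(sampled_answers).most_common(1)[0][0]
```

In this sketch the evaluation-feedback loop is the simulate-verify-backpropagate cycle: trajectories the model itself rates poorly are visited less often, while self-consistency voting over several completed trajectories filters out hallucinated answers.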