Abstract: As daily disinfection becomes routine, intelligent disinfection robots offer a highly effective way to carry it out. Robots usually perceive their surroundings through vision, but object detection based on supervised learning typically requires a large amount of labeled data for training. Labeling a large dataset is costly, whereas a model trained on a small labeled dataset is prone to overfitting. Few-shot object detection is therefore an effective solution. Building on the SimDet model, this study proposes the SimDet+ model. First, a self-supervised pre-training stage is added, in line with the characteristics of the object detection task in a disinfection scene. Second, because query images are available for reference, the classification layer is improved: the confidence level is computed with cosine similarity instead of a fully connected layer, and this non-parametric calculation effectively avoids overfitting. For the disinfection scene, a 22-minute video dataset and a detection dataset containing eight categories of objects are produced and used separately in the two training stages. Self-supervised pre-training effectively reduces the cost of data labeling and increases the mAP of the downstream task from 0.2162 to 0.5302.
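
The cosine-similarity classification described above can be illustrated with a minimal sketch. It is not the SimDet+ implementation: the function names, the two-dimensional feature vectors, and the category labels are hypothetical and chosen only for illustration. The point it shows is that the confidence for each category is computed directly from a candidate region's feature and a query image's feature, with no trainable weights in the classification step.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention assumed for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_scores(
    region_feature: Sequence[float],
    query_features: Dict[str, Sequence[float]],
) -> Dict[str, float]:
    """Score one candidate region against the query feature of each category.

    No parameters are learned here; the score depends only on the features.
    """
    return {
        label: cosine_similarity(region_feature, feature)
        for label, feature in query_features.items()
    }


# Hypothetical features; in practice they would come from a feature extractor.
queries = {"door_handle": [1.0, 0.0], "table": [0.0, 1.0]}
scores = similarity_scores([0.9, 0.1], queries)
best_label = max(scores, key=scores.get)
print(best_label, scores)
```

Because the score is bounded in [-1, 1] and involves no learned classification weights, the classification step itself has nothing to overfit when only a few labeled examples per category are available.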