Abstract: Detecting young grape cluster fruits is challenging because of background color similarity, occlusion, and lighting variations. To achieve robust detection under these varying conditions, an improved YOLO v8n model integrating the shuffle attention (SA) mechanism was proposed. Incorporating the SA mechanism into the Neck network of YOLO v8n enhanced the multi-scale feature fusion ability of the network, strengthened the feature representation of the detection target, and suppressed irrelevant information. This improved detection accuracy without significantly increasing network depth or memory overhead. Wise intersection over union loss (Wise-IoU Loss) with a dynamic nonmonotonic focusing mechanism was adopted as the bounding box regression loss function to accelerate network convergence and further improve detection accuracy. A Grape dataset was constructed, comprising 3 780 images of young grape cluster fruits in complex scenarios together with the corresponding annotation files. Training and testing the improved model, SAW-YOLO v8n, on this dataset showed that its precision (P), recall (R), mean average precision (mAP), and F1 score were 92.80%, 91.30%, 96.10%, and 92.04%, respectively, with a detection speed of 140.85 f/s and a model size of only 6.20 MB. Compared with SSD, YOLO v5s, YOLO v6n, YOLO v7-tiny, and YOLO v8n, the mAP increased by 16.06%, 1.05%, 1.48%, 0.84%, and 0.73%, respectively, the F1 score increased by 24.85%, 1.43%, 1.43%, 1.09%, and 1.60%, respectively, and the model size was reduced by 93.16%, 56.94%, 37.63%, 47.00%, and 0%, respectively. The model size was thus the smallest among all compared models, showing clear advantages in both lightweight design and accuracy.
Moreover, detection of young grape cluster fruits under different degrees of occlusion and different lighting conditions was also examined. The results showed that the detection method based on SAW-YOLO v8n adapted to changes in occlusion and lighting and had good robustness. In summary, SAW-YOLO v8n met the requirements of high-precision, high-speed, and lightweight detection of young grape cluster fruits, while also offering strong robustness and real-time performance.
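
As a quick consistency check on the reported metrics, the F1 score can be recomputed from the stated precision and recall. The following is a minimal sketch that assumes the standard definition of F1 as the harmonic mean of P and R; the function name `f1_score` is illustrative and is not taken from the original work.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        # Avoid division by zero when the model produces no correct detections.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported in the abstract, expressed as fractions.
P, R = 0.9280, 0.9130

f1 = f1_score(P, R)
print(f"F1 = {f1 * 100:.2f}%")  # prints: F1 = 92.04%
```

The recomputed value agrees with the reported F1 score of 92.04%.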