Tomato detection in natural environment based on improved YOLOv8 network

In this paper, an improved lightweight YOLOv8 method is proposed to detect the ripeness of tomato fruits, addressing the problems of subtle differences between neighboring ripening stages and mutual occlusion of branches, leaves, and fruits. The method replaces the backbone network of the original YOLOv8 with the more lightweight MobileNetV3 structure to reduce the number of model parameters; it integrates the Convolutional Block Attention Module (CBAM) into the feature extraction network to enhance the network's ability to extract tomato fruit features; and it introduces SCYLLA-IoU (SIoU) as the bounding-box regression loss function of YOLOv8, effectively resolving the mismatch between predicted boxes and ground-truth boxes and improving recognition accuracy. Compared with current mainstream models such as ResNet50, VGG16, YOLOv3, YOLOv5, and YOLOv7, the proposed model holds an advantage in precision, recall, and detection accuracy. The experimental results show that the improved MCS-YOLOv8 model achieves a precision of 91.2%, a recall of 90.2%, and a mean average precision of 90.3% on the test set. The detection time for a single image is 5.4 ms, and the model occupies only 8.7 MB of memory. The model has a clear advantage in both detection speed and precision, showing that the improved MCS-YOLOv8 model can provide strong technical support for tomato-picking robots in complex field environments.
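To make the CBAM component of the abstract concrete, the following is a minimal, illustrative PyTorch sketch of a generic CBAM block of the kind described as being inserted into the feature extraction network. The reduction ratio, kernel size, and feature-map dimensions are assumptions for illustration and are not taken from the paper.

```python
# Illustrative sketch of a generic CBAM (Convolutional Block Attention Module).
# Hyperparameters (reduction ratio, 7x7 spatial kernel) are assumed defaults,
# not values reported in the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Shared MLP applied to both average-pooled and max-pooled descriptors
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = self.mlp(F.adaptive_avg_pool2d(x, 1))
        mx = self.mlp(F.adaptive_max_pool2d(x, 1))
        return torch.sigmoid(avg + mx)


class SpatialAttention(nn.Module):
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Channel-wise average and max maps, concatenated and convolved
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))


class CBAM(nn.Module):
    """Applies channel attention, then spatial attention, to a feature map."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.ca = ChannelAttention(channels, reduction)
        self.sa = SpatialAttention()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.ca(x)
        return x * self.sa(x)


if __name__ == "__main__":
    feat = torch.randn(1, 256, 40, 40)  # hypothetical backbone feature map
    print(CBAM(256)(feat).shape)        # torch.Size([1, 256, 40, 40])
```

In a YOLOv8-style detector, such a block would typically be placed after selected backbone or neck stages so that channel and spatial attention reweight the feature maps before detection heads consume them; the exact insertion points used in the paper are not specified in this abstract.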
Supporting Agencies
Hebei Province

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.