Leveraging deep semantic segmentation for assisted weed detection

In agriculture, identifying and controlling weeds is crucial because these plants threaten crop growth and development by competing for vital resources such as nutrients, water, and light. A promising solution is the adoption of smart weed control systems (SWCS), which significantly reduce the use of harmful chemical products. SWCS also lower production costs and support a more sustainable and eco-friendly approach to farming. However, implementing SWCS in natural fields can be challenging, mainly because plants are difficult to localize accurately. To address this issue, a visual identification system can label plants in images through a process known as semantic segmentation. In this work, we implemented, validated, and compared three deep learning approaches for the semantic pixel segmentation of high densities of both crops (Zea mays) and weeds (including narrow-leaf and broad-leaf weeds): Mask Region-based Convolutional Neural Network (Mask R-CNN), Mask R-CNN enhanced with an Atrous Spatial Pyramid Pooling module (Mask R-CNN-ASPP), and a proposed Residual U-Net architecture. Data augmentation and transfer learning were also applied. Model performance was evaluated with the well-known metrics Precision, Recall, Dice similarity coefficient (DSC), and mean Intersection-over-Union (mIoU). The DSC and mIoU of the Mask R-CNN-ASPP based models were up to 10.63% and 10.54% higher, respectively, than those of the Mask R-CNN based models. Nonetheless, the proposed Residual U-Net architecture outperformed the Mask R-CNN-ASPP based networks in all metrics, reaching a DSC of 92.98% and an mIoU of 87.12%. We therefore conclude that the proposed Residual U-Net-like architecture is the best of the compared alternatives for semantic segmentation in images with high plant density.
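To make the four evaluation metrics concrete, the following is a minimal sketch of how Precision, Recall, DSC, and mIoU can be computed from a predicted label mask and a ground-truth mask using their standard pixel-count definitions. It is not the authors' evaluation code. The class encoding (0 = background, 1 = crop, 2 = weed), the function names, and the choice to score a class with no pixels as 0.0 are all assumptions made for illustration; the abstract also does not say whether scores are averaged per image or over the whole dataset.

```python
from typing import Dict, List, Sequence

Mask = Sequence[Sequence[int]]  # 2-D grid of integer class labels (hypothetical encoding)


def per_class_metrics(pred: Mask, truth: Mask, num_classes: int) -> List[Dict[str, float]]:
    """Return precision, recall, DSC and IoU for each class label."""
    tp = [0] * num_classes  # pixels correctly assigned to the class
    fp = [0] * num_classes  # pixels wrongly assigned to the class
    fn = [0] * num_classes  # pixels of the class that were missed

    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == t:
                tp[p] += 1
            else:
                fp[p] += 1
                fn[t] += 1

    def ratio(num: float, den: float) -> float:
        # Assumed convention: an undefined ratio (empty class) scores 0.0.
        return num / den if den else 0.0

    results = []
    for c in range(num_classes):
        results.append({
            "precision": ratio(tp[c], tp[c] + fp[c]),
            "recall": ratio(tp[c], tp[c] + fn[c]),
            "dsc": ratio(2 * tp[c], 2 * tp[c] + fp[c] + fn[c]),
            "iou": ratio(tp[c], tp[c] + fp[c] + fn[c]),
        })
    return results


def mean_iou(per_class: List[Dict[str, float]]) -> float:
    """Average the per-class IoU values (unweighted mean over classes)."""
    return sum(m["iou"] for m in per_class) / len(per_class)


# Toy 2x3 masks: 0 = background, 1 = crop, 2 = weed (illustrative labels only).
truth = [[0, 1, 1],
         [0, 2, 2]]
pred = [[0, 1, 2],
        [0, 2, 2]]

scores = per_class_metrics(pred, truth, num_classes=3)
print(scores[1])         # crop: one of two crop pixels recovered
print(mean_iou(scores))  # unweighted mean of the three class IoUs
```

In this toy case one crop pixel is predicted as weed, so the crop class has a recall of 0.5 and the weed class has a precision of 2/3. For any single class, DSC is never lower than IoU, which is consistent with the reported DSC (92.98%) exceeding the reported mIoU (87.12%).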
Our research addresses the challenge of weed identification and control in agriculture, helping farmers produce crops more efficiently while minimizing environmental impact.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
PAGEPress has chosen to apply the Creative Commons Attribution NonCommercial 4.0 International License (CC BY-NC 4.0) to all manuscripts to be published.