Empowering farmers with artificial intelligence: a retrieval-augmented generation based large language model advisory framework
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
This study presents a retrieval-augmented generation (RAG) based system designed to provide farmers with expert agricultural advisory services. The framework delivers context-aware guidance on crop cultivation, pest and disease management, fertilizer application, and other agronomic practices, and compares the performance of four large language models (LLMs) in generating these recommendations. The system processes package of practices (PoP) documents for five major crops (maize, ragi, sweet potato, cotton, and groundnut) through semantic chunking and embedding with Amazon Titan via BedrockEmbeddings. The resulting vector representations are indexed in ChromaDB to enable efficient similarity search for query-relevant content. On receiving a user query, the system retrieves the most semantically similar document chunks and incorporates them into a structured prompt. Four LLMs (Llama3.1, Mistral, Phi3, and Qwen2.5) were evaluated for their effectiveness in generating accurate agricultural recommendations. Performance was evaluated across multiple dimensions. Relevance and retrieval were assessed using precision@K, recall@K, mean reciprocal rank (MRR), and normalized discounted cumulative gain (NDCG). Lexical overlap was measured with the bilingual evaluation understudy (BLEU) and recall-oriented understudy for gisting evaluation (ROUGE-1, ROUGE-2, ROUGE-L) metrics. Semantic quality was analyzed using the Bidirectional Encoder Representations from Transformers score (BERTScore) precision, recall, and F1, together with semantic similarity and faithfulness, to capture contextual alignment between generated and reference responses. Source attribution was assessed through the attribution score, while efficiency was measured using retrieval time, generation time, and total time. Overall, Mistral and Qwen2.5 achieved the highest performance, demonstrating superior relevance, semantic quality, and efficiency.
This evaluation highlights which LLMs perform best for the agricultural domain and illustrates the potential of knowledge-grounded AI systems to democratize agricultural expertise, particularly in regions with limited access to traditional advisory services.
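The retrieval metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes binary relevance judgments (a chunk is either relevant to a query or not) and uses the common textbook definitions; the abstract does not state the exact formulas or the value of K used in the study, and the chunk identifiers below are hypothetical.

```python
import math

def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved chunks that are relevant."""
    hits = sum(1 for d in ranked_ids[:k] if d in relevant_ids)
    return hits / k

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of all relevant chunks that appear in the top k."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for d in ranked_ids[:k] if d in relevant_ids)
    return hits / len(relevant_ids)

def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant chunk, or 0 if none is retrieved."""
    for rank, d in enumerate(ranked_ids, start=1):
        if d in relevant_ids:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)

def ndcg_at_k(ranked_ids, relevant_ids, k):
    """NDCG with binary gains (assumption): DCG divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, d in enumerate(ranked_ids[:k], start=1)
        if d in relevant_ids
    )
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical example: four retrieved chunk IDs, two of them relevant.
ranked = ["c3", "c7", "c1", "c9"]
relevant = {"c7", "c1"}
scores = {
    "precision@3": precision_at_k(ranked, relevant, 3),
    "recall@3": recall_at_k(ranked, relevant, 3),
    "mrr": mean_reciprocal_rank([(ranked, relevant)]),
    "ndcg@3": ndcg_at_k(ranked, relevant, 3),
}
```

In this toy case the first relevant chunk sits at rank 2, so the reciprocal rank is 0.5, while recall@3 is 1.0 because both relevant chunks fall within the top three results.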
CRediT authorship contribution
Shreeram Sawant contributed to the conception and design of the RAG-based LLM advisory framework, analysis and interpretation of experimental results, drafting of the original manuscript, and critical revision for important intellectual content. Rahul Nair contributed to the conception and design of the system architecture, analysis and interpretation of performance data, drafting of the methodology sections, and critical revision for important intellectual content. Siddharth Hariharan contributed to the conception and design of the research approach, analysis and interpretation of results, and critical revision of the manuscript for important intellectual content. All authors provided final approval of the version to be published and agreed to be accountable for all aspects of the work.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.