Original Articles

Empowering farmers with artificial intelligence: a retrieval-augmented generation based large language model advisory framework

Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
Published: 27 January 2026

Authors

This study presents a retrieval-augmented generation (RAG) based system designed to provide farmers with expert agricultural advisory services. The framework delivers context-aware guidance on crop cultivation, pest and disease management, fertilizer application, and other agronomic practices, and compares the performance of four large language models (LLMs) in generating these recommendations. The system processes package of practices (PoP) documents for five major crops (maize, ragi, sweet potato, cotton, and groundnut) through semantic chunking and embedding with Amazon Titan via BedrockEmbeddings. The vector representations are indexed in ChromaDB to enable efficient similarity search for query-relevant content. Upon receiving a user query, the system retrieves the most semantically similar document chunks and incorporates them into a structured prompt. Four LLMs (Llama3.1, Mistral, Phi3, and Qwen2.5) were evaluated for their effectiveness in generating accurate agricultural recommendations. Performance was evaluated across multiple dimensions. Retrieval relevance was assessed using precision@K, recall@K, mean reciprocal rank (MRR), and normalized discounted cumulative gain (NDCG). Lexical overlap was measured with the bilingual evaluation understudy (BLEU) and recall-oriented understudy for gisting evaluation (ROUGE-1, ROUGE-2, ROUGE-L) metrics. Semantic quality was analyzed using Bidirectional Encoder Representations from Transformers score (BERTScore) precision, recall, and F1, together with semantic similarity and faithfulness, to capture contextual alignment between generated and reference responses. Source attribution was assessed through the attribution score, while efficiency was measured using retrieval time, generation time, and total time. Overall, Mistral and Qwen2.5 achieved the highest performance, demonstrating superior relevance, semantic quality, and efficiency.
This evaluation highlights which LLMs perform best for the agricultural domain and illustrates the potential of knowledge-grounded AI systems to democratize agricultural expertise, particularly in regions with limited access to traditional advisory services.
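
The retrieval metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: it assumes binary relevance (a chunk is either relevant to a query or not), and the chunk identifiers and function names are hypothetical, chosen only for illustration.

```python
import math
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved chunks that are relevant."""
    if k <= 0:
        return 0.0
    hits = sum(1 for chunk_id in ranked[:k] if chunk_id in relevant)
    return hits / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant chunks that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for chunk_id in ranked[:k] if chunk_id in relevant)
    return hits / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant chunk, or 0.0 if none is retrieved."""
    for rank, chunk_id in enumerate(ranked, start=1):
        if chunk_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Sequence[Sequence[str]],
                         relevants: Sequence[Set[str]]) -> float:
    """Average reciprocal rank over a set of queries."""
    if not runs:
        return 0.0
    scores = [reciprocal_rank(r, rel) for r, rel in zip(runs, relevants)]
    return sum(scores) / len(scores)


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """NDCG with binary gains: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, chunk_id in enumerate(ranked[:k], start=1)
        if chunk_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical chunk IDs for one query about maize cultivation.
    ranked = ["maize_03", "cotton_11", "maize_07", "ragi_02"]
    relevant = {"maize_03", "maize_07"}
    print(precision_at_k(ranked, relevant, 3))
    print(recall_at_k(ranked, relevant, 3))
    print(reciprocal_rank(ranked, relevant))
    print(ndcg_at_k(ranked, relevant, 3))
```

In this toy query, two of the top three chunks are relevant, so precision@3 is 2/3 while recall@3 is 1.0. NDCG falls below 1 because the second relevant chunk is ranked third rather than second. Graded relevance judgments, if used, would replace the binary gain of 1.0 with a per-chunk gain.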

CRediT authorship contribution

Shreeram Sawant contributed to conception and design of the RAG-based LLM advisory framework, analysis and interpretation of experimental results, drafting of the original manuscript, and critical revision for important intellectual content. Rahul Nair contributed to conception and design of the system architecture, analysis and interpretation of performance data, drafting of methodology sections, and critical revision for important intellectual content. Siddharth Hariharan contributed to conception and design of the research approach, analysis and interpretation of results, and critical revision of the manuscript for important intellectual content. All authors provided final approval of the version to be published and agreed to be accountable for all aspects of the work.

Data Availability Statement

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Dataset: https://doi.org/10.5281/zenodo.15881813
Code: https://doi.org/10.5281/zenodo.17542352

How to Cite

“Empowering farmers with artificial intelligence: a retrieval-augmented generation based large language model advisory framework” (2026) Journal of Agricultural Engineering, 57(2). doi:10.4081/jae.2026.1908.