TY  - EJOU
AU  - Zhao, Changxu
AU  - Liu, Jianping
AU  - Wang, Xiaofeng
AU  - Sun, Wei
AU  - Liu, Libo
AU  - Ren, Haiyu
AU  - Liu, Pan
AU  - Wang, Qiantong

TI  - A Review of Foundation Models for Multi-Task Agricultural Question Answering
T2  - Computers, Materials & Continua

PY  - 2026
VL  - 87
IS  - 2
SN  - 1546-2226

AB  - Foundation models are reshaping artificial intelligence, yet their deployment in specialised domains such as agricultural question answering (AQA) still faces challenges including data scarcity and barriers to domain-specific knowledge. To systematically review recent progress in this area, this paper adopts a task–paradigm perspective and examines applications across three major AQA task families. For text-based QA, we analyse the strengths and limitations of retrieval-based, generative, and hybrid approaches built on large language models, revealing a clear trend toward hybrid paradigms that balance precision and flexibility. For visual diagnosis, we discuss techniques such as cross-modal alignment and prompt-driven generation, which are pushing systems beyond simple pest and disease recognition toward deeper causal reasoning. For multimodal reasoning, we show how the fusion of heterogeneous data—including text, images, speech, and sensor streams—enables comprehensive decision-making for diagnosis, monitoring, and yield prediction. To address the lack of unified benchmarks, we further propose a standardised evaluation protocol and a diagnostic taxonomy specifically designed to characterise agriculture-specific errors. Finally, we outline a concrete AQA roadmap that emphasises safety alignment, hallucination control, and lightweight deployment, aiming to guide future systems toward greater efficiency, trustworthiness, and sustainability.
KW  - Foundation models; agricultural question answering; multimodal learning; large language models; smart agriculture; artificial intelligence

DO  - 10.32604/cmc.2025.074409