Abstract: Crop disease prescription recommendation faces challenges including diverse crop varieties, complex disease types, severe sample imbalance, varied prescription categories, and multi-modal data. To address these, prescription recommendation methods tailored to diversified, extensible, and multi-modal application scenarios were explored using multi-modal electronic medical record (EMR) data. To accommodate the varying prescription preferences of agricultural producers, a diversified crop disease prescription recommendation model based on CdsBERT-RCNN and diagnostic reasoning was developed, improving diagnostic accuracy and prescription diversity for 32 common diseases. For untrained rare diseases and newly added prescriptions, an extensible crop disease prescription recommendation model based on MC-SEM and semantic retrieval was developed, improving semantic matching accuracy and case library retrieval speed and providing prescription recommendations for untrained diseases. For multi-modal information collection and input, a multi-modal crop disease prescription recommendation model based on BATNet multi-layer feature fusion was developed, improving prescription recommendation performance for multi-modal inputs. Results showed that CdsBERT-RCNN achieved a diagnostic accuracy of 85.65% and an F1 score of 85.63% across the 32 common diseases. In tests with varying levels of input completeness, the model achieved 81.19% accuracy with symptom information alone, and the inclusion of environmental and crop information improved accuracy by 1.65 percentage points and 3.61 percentage points, respectively. MC-SEM achieved a Pearson correlation coefficient of 86.34% and a Spearman correlation coefficient of 77.67% on EMR semantic matching tasks, and it achieved accuracies of 88.20% and 82.04% in the closed-set and open-set prescription recommendation tests, respectively, demonstrating its capability to extend to untrained diseases.
BATNet achieved an accuracy of 98.88% and an F1 score of 98.83% on multi-modal input prescription recommendation tasks. Application scenario analysis and testing validated the model’s generalization capability for incomplete modalities (text only or image only) and incomplete information input (crop, environment, symptoms). These results offer an approach to digitally enabled decision-making for crop disease control.
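
The extensibility claim for retrieval-based recommendation can be illustrated with a minimal sketch. It is not the MC-SEM implementation: the three-dimensional vectors stand in for learned EMR embeddings, and the names `cosine`, `recommend`, `case_library`, and the `prescription_*` labels are hypothetical. It only shows, under those assumptions, how adding a case to the library can change the recommendation for a previously unseen disease without retraining.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recommend(query_vec, case_library, k=1):
    """Return the prescriptions of the k cases most similar to the query."""
    ranked = sorted(
        case_library,
        key=lambda case: cosine(query_vec, case["vec"]),
        reverse=True,
    )
    return [case["prescription"] for case in ranked[:k]]

# Toy case library: the vectors are placeholders for EMR embeddings (assumption).
case_library = [
    {"vec": [0.9, 0.1, 0.0], "prescription": "prescription_A"},
    {"vec": [0.1, 0.9, 0.0], "prescription": "prescription_B"},
]

# A query that resembles neither stored case (an "untrained" disease).
query = [0.0, 0.2, 0.9]
before = recommend(query, case_library)

# Extending the library requires only appending a new case, not retraining.
case_library.append({"vec": [0.0, 0.1, 0.95], "prescription": "prescription_C"})
after = recommend(query, case_library)

print(before, after)
```

In this sketch the quality of the recommendation depends entirely on the embedding, which is the part a semantic matching model would supply.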