Retrieval Augmented Generation Data Query Technique for Pineapple Cultivation
DOI:
https://doi.org/10.31272/jeasd.2732
Keywords:
Pineapple cultivation, Agriculture, Large language model, Retrieval augmented generation, Generative artificial intelligence
Abstract
Generative artificial intelligence is advancing rapidly, and Large Language Models in particular have accelerated the development of machine learning applications. This work presents a large language model-based technique for querying data collected during MD2 pineapple crop production. Retrieval Augmented Generation was used to feed structured and unstructured data to two large language models (GPT-4 and LLAMA2) to train and fine-tune the models. The performance of the models was then measured by comparing actual and predicted question-answer pairs. The models achieved a correct answer rate of 78% to 87% for structured data and 75% to 79% for unstructured data. However, the correct answer rate fell to 61% to 68% when answering a question required both structured and unstructured data. These results suggest that large language models merit further investigation as a way to give farmers useful insights when making crop management decisions.
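The retrieval and scoring steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the records, field names, bag-of-words similarity, and exact-match scoring below are all assumptions chosen for illustration, and the call to a large language model is left out. The sketch flattens structured rows into text, retrieves the records most similar to a question, builds a prompt from them, and computes a correct answer rate from actual and predicted answers.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def row_to_text(row):
    """Flatten one structured record (a dict) into a single text string."""
    return ", ".join(f"{key}: {value}" for key, value in row.items())


def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(q, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, context):
    """Combine retrieved context and the question into one prompt string."""
    joined = "\n".join(f"- {item}" for item in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


def correct_answer_rate(actual, predicted):
    """Fraction of predictions matching the reference answers.

    Assumption: a prediction counts as correct when its tokens equal the
    reference tokens. The paper's own scoring rule is not stated here.
    """
    matches = sum(tokenize(a) == tokenize(p) for a, p in zip(actual, predicted))
    return matches / len(actual)


# Hypothetical records, invented for illustration only.
structured_rows = [
    {"plot": "A1", "activity": "fertilizer", "date": "2023-03-01"},
    {"plot": "B2", "activity": "weeding", "date": "2023-03-05"},
]
unstructured_notes = [
    "Field note: drainage in the lower block was poor after heavy rain.",
]
documents = [row_to_text(r) for r in structured_rows] + unstructured_notes

question = "When was fertilizer applied on plot A1?"
prompt = build_prompt(question, retrieve(question, documents, k=2))
```

In a full pipeline the `prompt` string would be sent to a model such as GPT-4 or LLAMA2, and the returned answers would be passed to `correct_answer_rate` together with the reference answers. A production system would more likely use embedding-based retrieval than the word-count similarity assumed here.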
License
Copyright (c) 2025 Badril Abu Bakar, Siti Noor Aliah Baharom, Mohd. Taufik Ahmad, Mohd. Nizam Zubir, Adli Fikri Ahmad Sayuti, Mohd Aufa Mhd Bookeri (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.