Abstract Detail

The Potential of Machine Learning for Plant Biology

Mata-Montero, Erick [1], Mora-Fallas, Adan [1], Goëau, Hervé [2], Bonnet, Pierre [2], Joly, Alexis [3].

Automated visual analysis of Herbarium specimens for phenological data extraction.

Herbarium specimens provide invaluable material for large-scale studies of plant ecology and taxonomy (Soltis et al., 2018). With the recent monumental efforts to digitize natural history collections, the scientific community has new opportunities to investigate research questions that were previously impossible to address or too time-consuming. Nevertheless, manually extracting basic information from a huge number of herbarium sheets remains an important limiting factor, even when those sheets have been digitized. Because we believe deep learning can help accelerate this extraction, we investigate its potential for phenological data extraction, following up on previous work (Carranza-Rojas et al., 2017; Lorieul et al., 2019). Instance segmentation techniques have recently proven effective in a wide variety of contexts (e.g., detection of manufactured objects, humans, or animals). The Mask-RCNN method, in particular, seems promising in the context of herbarium sheets. In this work, we propose to evaluate the Mask-RCNN method for detecting and counting reproductive organs, within a global system that supports active learning (to minimize the number of organ masks that researchers must annotate manually). We discuss experiments that address the effectiveness, limits, and time requirements of our approach in the context of a phenological study of several hundred herbarium specimens and the analysis of several thousand reproductive organs. This study opens the door to a much larger number of routine tasks that can now be addressed with machine learning approaches. Consequently, we expect the botanical community to increase its capacity and accelerate the pace of scientific discovery.

Soltis, P. S., Nelson, G., & James, S. A. (2018). Green digitization: Online botanical collections data answering real‐world questions. Applications in Plant Sciences, 6(2), e1028.
Carranza-Rojas, J., Goeau, H., Bonnet, P., Mata-Montero, E., & Joly, A. (2017). Going deeper in the automated identification of Herbarium specimens. BMC Evolutionary Biology, 17(1), 181.
Lorieul, T., Pearson, K. D., Ellwood, E. R., Goëau, H., Molino, J. F., Sweeney, P. W., ... & Soltis, P. S. (2019). Toward a large‐scale and deep phenological stage annotation of herbarium specimens: Case studies from temperate, tropical, and equatorial floras. Applications in Plant Sciences, 7(3), e01233.

1 - Costa Rica Institute of Technology, Computing, Calle 15, Avenida 14, 1 km Sur de la Basílica de los Ángeles, Cartago, Cartago, 30101, Costa Rica
2 - AMAP, Université de Montpellier, CIRAD, CNRS, INRA, IRD, Montpellier, CEDEX 5, France
3 - Institut national de recherche en informatique et en automatique (INRIA), ZENITH team, Laboratory of Informatics, Robotics and Microelec, Montpellier, CEDEX 5, France

deep learning
machine learning
convolutional neural network
herbarium data
natural history collections

Presentation Type: Symposium Presentation
Abstract ID: 768
Candidate for Awards: None

Copyright © 2000-2019, Botanical Society of America. All rights reserved