Research Project
Understanding the extremely variable, complex shape and venation characters of angiosperm leaves is one of the most challenging problems in botany.
Machine learning offers opportunities to analyze large numbers of specimens, to discover novel leaf features of angiosperm clades that may have phylogenetic significance, and to use those characters to classify unknowns. It remains an open question whether learning and classification are possible among major evolutionary groups such as families and orders, which usually contain hundreds to thousands of species each and exhibit many times the foliar variation of individual species. Here we tested whether a computer vision algorithm could use a large database of leaf images from 2,001 genera to learn features of botanical families and orders, then classify novel images. The resulting automated system learned to classify images into families and orders with a success rate many times greater than chance.
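To make "many times greater than chance" concrete, here is a minimal Python sketch of how such a comparison could be computed. It is an illustration, not the project's actual evaluation code: the function names, the toy labels, and the accuracy value of 0.6 are all hypothetical. It assumes "chance" means uniform random guessing among the classes present, and it also shows the stricter majority-class baseline, which matters when classes are unbalanced.

```python
from collections import Counter


def chance_accuracy(labels):
    """Expected accuracy of uniform random guessing among k classes: 1/k."""
    return 1.0 / len(set(labels))


def majority_baseline(labels):
    """Accuracy obtained by always predicting the most common class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def fold_over_chance(accuracy, labels):
    """How many times better than uniform chance a classifier's accuracy is."""
    return accuracy / chance_accuracy(labels)


# Hypothetical toy labels (placeholder family names, not real data).
labels = ["family_A"] * 50 + ["family_B"] * 30 + ["family_C"] * 20

# Hypothetical classifier accuracy, chosen only for illustration.
observed_accuracy = 0.6

print("uniform chance:", chance_accuracy(labels))
print("majority baseline:", majority_baseline(labels))
print("fold over chance:", fold_over_chance(observed_accuracy, labels))
```

With many classes, uniform chance becomes very small, so even a modest absolute accuracy can be many times greater than chance. Reporting the majority-class baseline alongside it guards against overstating performance on unbalanced data.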