Apply for community grant: Academic project (gpu and storage)

#1
by piperod91 - opened
Serre Lab org
•
edited Aug 1, 2023

Using Artificial Intelligence to Identify Fossil Angiosperm Leaves at Family Level

The identification of fossil angiosperm leaves is a well-known challenge in paleobotany. Recent progress in computer vision offers a path toward AI agents that assist paleobotanists, but several obstacles have slowed progress. Images of taxonomically vetted fossil leaves are scarce relative to the very large training libraries that AI models require, and their quality varies widely because of preservation and taphonomic factors. To overcome these limitations, we have developed a deep generative model that learns to synthesize photorealistic fossils automatically from cleared and x-ray images of extant leaves. Because a considerable number of unvetted images are available in collections such as the Yale Peabody Museum Paleobotanical Collection, we also use unsupervised training, which extends how well our synthesizer model can generalize. We then use these synthetic fossils to train a deep neural network to classify both cleared leaves and fossil leaves at the family level. We evaluate accuracy with a leave-one-family-out cross-validation procedure, in which real leaf fossils of the test family are excluded from training. Accuracy on real fossil leaves is well above chance at the family level, even when the system saw no real fossils of that family during training. We also carry out a study using explainability methods to identify some of the strategies the network uses for classification. Our results strongly suggest that AI methods will provide significant assistance to paleobotanists in identifying leaf fossils. We will shortly release a website where members of the community can upload fossils from any part of the world and test the potential of our approach first-hand.
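For readers unfamiliar with the evaluation, here is a minimal sketch of how a leave-one-family-out split could be built. It is not our actual code: the function name, the `family` and `kind` keys, and the three `kind` values are hypothetical, and it assumes that synthetic fossils and cleared leaves of the held-out family remain in the training set while only its real fossils are withheld.

```python
def leave_one_family_out(samples):
    """Yield (held_out_family, train, test) splits.

    Each sample is a dict with hypothetical keys:
      "family": the family label
      "kind":   "real_fossil", "synthetic_fossil", or "cleared_leaf"
    """
    families = sorted({s["family"] for s in samples})
    for held_out in families:
        # Real fossils of the held-out family form the test set.
        test = [
            s for s in samples
            if s["family"] == held_out and s["kind"] == "real_fossil"
        ]
        # Everything else stays in training (assumption: synthetic fossils
        # and cleared leaves of the held-out family are kept).
        train = [
            s for s in samples
            if not (s["family"] == held_out and s["kind"] == "real_fossil")
        ]
        if test:  # skip families that have no real fossils to test on
            yield held_out, train, test


# Illustrative toy data, not taken from our dataset.
samples = [
    {"id": 1, "family": "Fagaceae", "kind": "real_fossil"},
    {"id": 2, "family": "Fagaceae", "kind": "synthetic_fossil"},
    {"id": 3, "family": "Betulaceae", "kind": "real_fossil"},
    {"id": 4, "family": "Betulaceae", "kind": "cleared_leaf"},
]

for family, train, test in leave_one_family_out(samples):
    print(family, [s["id"] for s in train], [s["id"] for s in test])
```

In each fold the classifier would be trained on `train` and scored only on `test`, so the reported accuracy reflects real fossils from a family whose real fossils the model never saw.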

A GPU grant would help us run all the pieces of this project and make the models available to the community.
We are running:

  • SAM for segmenting the leaves.
  • Stable Diffusion ControlNet for domain adaptation.
  • BEiT for classification.
andy-wyx pinned discussion
