---
title: README
sdk: static
pinned: true
---

Welcome to the space of ChouBERT, a French language model for plant health text mining.

It is named ChouBERT because it is designed to monitor plants such as cabbage ("chou" in French), it handles polysemous words such as "chou" and "chouchou" well, and it is cute (also "chou").

To build ChouBERT, we further pre-trained the CamemBERT base model on French plant health bulletins and tweets.

The ChouBERT-n models are pre-trained for n epochs with masked language modeling (MLM). You may use these models to reproduce our experiments.
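
As a rough illustration of how an MLM checkpoint could be probed, the sketch below builds a checkpoint name from the epoch count and asks the model to fill a masked word using the `transformers` fill-mask pipeline. The Hub namespace `ChouBERT` and the helper names `checkpoint_id` and `top_fillers` are assumptions made for this example; check the actual repository names before running it.

```python
from __future__ import annotations

# Assumption: checkpoints live under a "ChouBERT" namespace on the Hugging Face Hub.
# This namespace is hypothetical; replace it with the real one.
HUB_NAMESPACE = "ChouBERT"


def checkpoint_id(n_epochs: int, namespace: str = HUB_NAMESPACE) -> str:
    """Build the (assumed) repository id of the model pre-trained for n epochs."""
    if n_epochs <= 0:
        raise ValueError("n_epochs must be a positive integer")
    return f"{namespace}/ChouBERT-{n_epochs}"


def top_fillers(model_id: str, sentence: str, k: int = 5) -> list[tuple[str, float]]:
    """Return the k most likely fillers for the [MASK] placeholder in `sentence`."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    fill = pipeline("fill-mask", model=model_id)
    # Swap the generic placeholder for the tokenizer's own mask token.
    masked = sentence.replace("[MASK]", fill.tokenizer.mask_token)
    return [(p["token_str"], p["score"]) for p in fill(masked, top_k=k)]
```

Calling `top_fillers(checkpoint_id(16), "Attaque de pucerons sur le [MASK].")` downloads the checkpoint on first use, so it needs network access and an installed `transformers` backend.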

The ChouBERT-n-plant-health-ner models are ChouBERT-n models fine-tuned for Named Entity Recognition (NER) in the plant health domain. The NER paper: <https://hal.science/hal-04245168/>.
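
A minimal sketch of how an NER checkpoint might be queried with the `transformers` token-classification pipeline follows. The helper names `group_entities` and `extract_entities` are hypothetical, the model id is passed in by the caller, and the label set is whatever the checkpoint's configuration defines; nothing here assumes particular label names.

```python
from __future__ import annotations

from collections import defaultdict


def group_entities(entities: list[dict]) -> dict[str, list[str]]:
    """Group aggregated pipeline output by entity label, keeping text order."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for ent in entities:
        # With an aggregation strategy set, each item has "entity_group" and "word".
        grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)


def extract_entities(model_id: str, text: str) -> dict[str, list[str]]:
    """Run a token-classification checkpoint on `text` and group the spans found."""
    # Imported lazily so group_entities can be used without transformers installed.
    from transformers import pipeline

    ner = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge word pieces into whole entity spans
    )
    return group_entities(ner(text))
```

`extract_entities` takes the repository id of a ChouBERT-n-plant-health-ner model and returns a dictionary mapping each label to the entity strings found in the text.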

The ChouBERT-n-plant-health-tweet-classifier models are ChouBERT-n models fine-tuned to distinguish tweets that report plant health observations from other tweets. We describe how we built ChouBERT in this paper: <https://hal.archives-ouvertes.fr/hal-03621123>
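
The following sketch shows one way the tweet classifiers could be used to filter a batch of tweets with the `transformers` text-classification pipeline. The helper names, the `positive_label` parameter, and the 0.5 threshold are assumptions for illustration; the actual label names should be read from the checkpoint's `id2label` configuration.

```python
from __future__ import annotations


def select_observations(
    tweets: list[str],
    predictions: list[dict],
    positive_label: str,
    threshold: float = 0.5,
) -> list[str]:
    """Keep the tweets whose top prediction is `positive_label` with enough confidence."""
    if len(tweets) != len(predictions):
        raise ValueError("tweets and predictions must have the same length")
    return [
        tweet
        for tweet, pred in zip(tweets, predictions)
        if pred["label"] == positive_label and pred["score"] >= threshold
    ]


def classify_tweets(model_id: str, tweets: list[str]) -> list[dict]:
    """Return one {"label", "score"} dict per tweet from a classifier checkpoint."""
    # Imported lazily so select_observations works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    return classifier(tweets)
```

A typical flow would call `classify_tweets` with the repository id of a ChouBERT-n-plant-health-tweet-classifier model, then pass its output to `select_observations` together with the label that marks plant health observations.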

Our work shows that classifiers based on ChouBERT-16 and ChouBERT-32 generalize best when recognizing unseen hazards, especially polysemous terms.

We have also uploaded the CamemBERT-based classifiers as baselines.

Listen to a song about ChouBERT (generated with Suno): <https://suno.com/song/acb11b86-5433-4d97-8e68-e44aefc66a99>

### BibTeX entries

```bibtex
@inproceedings{jiang2022choubert,
  title = {ChouBERT: Pre-training French Language Model for Crowdsensing with Tweets in Phytosanitary Context},
  author = {Jiang, Shufan and Angarita, Rafael and Cormier, St{\'e}phane and Orensanz, Julien and Rousseaux, Francis},
  booktitle = {International Conference on Research Challenges in Information Science},
  pages = {653--661},
  year = {2022},
  organization = {Springer}
}

@inproceedings{jiang2022ner,
  title = {{Named Entity Recognition for Monitoring Plant Health Threats in Tweets: a ChouBERT Approach}},
  author = {Jiang, Shufan and Angarita, Rafael and Cormier, St{\'e}phane and Rousseaux, Francis},
  booktitle = {{2022 6th International Conference on Universal Village (UV)}},
  address = {Boston, United States},
  publisher = {{IEEE}},
  year = {2022},
  doi = {10.1109/UV56588.2022.10185492}
}
```