	---
	tags:
	- generated_from_trainer
	metrics:
	- precision
	- recall
	- f1
	- accuracy
	model-index:
	- name: tetis-textmine-2024-camembert-large-based
	  results: []
	widget:
	- text: À 8 M à l’ENE du phare de Nadji, le port de pêche de Sidi Abderrahmane (36° 29,7' N — 1° 05,7' E) est construit au bord du village de Soug el Bgar (pointe Rouge).
	  example_title: Defi_TextMine

	---

	---
	license: cc-by-nc-4.0
	---
	# [TETIS](https://www.umr-tetis.fr) @ [Challenge TextMine 2024](https://textmine.sciencesconf.org/resource/page/id/9)

	---
	## This is a NER model based on CamemBERT-large, built for the Kaggle competition (in French): https://www.kaggle.com/competitions/defi-textmine-2024/

	This model can be reused with the Hugging Face Transformers pipeline. For usage details, see its [GitHub repository](https://github.com/tetis-nlp/tetis-challenge_textmine_2024).
	---
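
As a minimal sketch of the pipeline usage mentioned above, the snippet below loads a token-classification pipeline and prints the entities it finds. The repository id in `MODEL_ID` is a hypothetical placeholder (this card does not state the Hub namespace), the helper names `format_entities` and `tag_text` are illustrative only, and the sample sentence is an abridged version of the widget text. The GitHub repository remains the reference for how the authors ran the model.

```python
# Hypothetical placeholder: replace with the actual Hub repository id of this model.
MODEL_ID = "your-namespace/tetis-textmine-2024-camembert-large-based"


def format_entities(entities):
    """Convert aggregated pipeline output into (label, text, score) tuples."""
    return [
        (e["entity_group"], e["word"], round(float(e["score"]), 3))
        for e in entities
    ]


def tag_text(text, model_id=MODEL_ID):
    """Run NER on `text` and return the formatted entities."""
    # Imported lazily so the helper above can be used without transformers installed.
    from transformers import pipeline

    ner = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge sub-word tokens into whole entities
    )
    return format_entities(ner(text))


if __name__ == "__main__":
    sentence = (
        "À 8 M à l’ENE du phare de Nadji, le port de pêche de Sidi Abderrahmane "
        "est construit au bord du village de Soug el Bgar (pointe Rouge)."
    )
    for label, word, score in tag_text(sentence):
        print(f"{label}\t{word}\t{score}")
```

With `aggregation_strategy="simple"`, each returned entity carries an `entity_group`, the matched `word`, a `score`, and character offsets, which is usually more convenient than per-token labels.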


	<img align="left" src="https://www.umr-tetis.fr/images/logo-header-tetis.png">

	| Participants |
	|----------------------|
	| Rémy Decoupes |
	| Roberto Interdonato |
	| Rodrique Kafando |
	| Mehtab Syed Alam |
	| Maguelonne Teisseire |
	| Mathieu Roche |
	| Sarah Valentin |

	---



	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# tetis-textmine-2024-camembert-large-based

	This model is a fine-tuned version of [camembert/camembert-large](https://huggingface.co/camembert/camembert-large). The Trainer did not record a dataset name (it was logged as `None`).
	It achieves the following results on the evaluation set:
	- Loss: 0.1106
	- Precision: 0.9327
	- Recall: 0.9471
	- F1: 0.9398
	- Accuracy: 0.9843
	- Aucun F1: 0.9434
	- Geogfeat F1: 0.9193
	- Geogfeat geogname F1: 0.9554
	- Geogname F1: 0.9133
	- Name geogname F1: 0.9519
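
As a quick sanity check on the figures above, the overall F1 should be the harmonic mean of the overall precision and recall. The sketch below recomputes it from the reported values; the function name `f1_score` is illustrative and not part of the training code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported for the final evaluation in this card.
PRECISION = 0.9327
RECALL = 0.9471

if __name__ == "__main__":
    print(round(f1_score(PRECISION, RECALL), 4))
```

The recomputed value rounds to the reported F1 of 0.9398.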

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 10
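
To illustrate what the `linear` scheduler does with these settings, the sketch below computes the learning rate at the end of each epoch. It assumes no warmup (none is listed in this card) and takes the 192 steps per epoch from the training results table; the function `linear_lr` is a plain-Python illustration of a linear decay schedule, not the Trainer's own code.

```python
BASE_LR = 2e-05        # learning_rate from the list above
STEPS_PER_EPOCH = 192  # from the training results table
NUM_EPOCHS = 10        # num_epochs from the list above
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step` for linear decay to zero, with optional linear warmup."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in range(NUM_EPOCHS + 1):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:4d}  lr {linear_lr(step):.2e}")
```

Under these assumptions the rate falls from 2e-05 at step 0 to half that value at step 960 (end of epoch 5) and reaches zero at step 1920.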

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy | Aucun F1 | Geogfeat F1 | Geogfeat geogname F1 | Geogname F1 | Name geogname F1 |
	|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|:--------:|:-----------:|:--------------------:|:-----------:|:----------------:|
	| No log | 1.0 | 192 | 0.1045 | 0.9171 | 0.9369 | 0.9269 | 0.9828 | 0.9303 | 0.8943 | 0.9509 | 0.9174 | 0.9373 |
	| No log | 2.0 | 384 | 0.1029 | 0.9223 | 0.9471 | 0.9345 | 0.9830 | 0.9339 | 0.9170 | 0.9522 | 0.9419 | 0.9377 |
	| 0.0072 | 3.0 | 576 | 0.0952 | 0.9136 | 0.9466 | 0.9298 | 0.9840 | 0.9226 | 0.8993 | 0.9587 | 0.9440 | 0.9571 |
	| 0.0072 | 4.0 | 768 | 0.1054 | 0.9347 | 0.9409 | 0.9378 | 0.9838 | 0.9380 | 0.9256 | 0.9603 | 0.9164 | 0.9433 |
	| 0.0072 | 5.0 | 960 | 0.1165 | 0.9229 | 0.9347 | 0.9288 | 0.9814 | 0.9328 | 0.9013 | 0.9441 | 0.9060 | 0.9451 |
	| 0.0071 | 6.0 | 1152 | 0.1070 | 0.9306 | 0.9462 | 0.9383 | 0.9840 | 0.9419 | 0.9144 | 0.9487 | 0.9213 | 0.9533 |
	| 0.0071 | 7.0 | 1344 | 0.1037 | 0.9285 | 0.9453 | 0.9368 | 0.9844 | 0.9392 | 0.9100 | 0.9534 | 0.9271 | 0.9507 |
	| 0.0013 | 8.0 | 1536 | 0.1127 | 0.9335 | 0.9475 | 0.9405 | 0.9841 | 0.9451 | 0.9175 | 0.9505 | 0.9222 | 0.9520 |
	| 0.0013 | 9.0 | 1728 | 0.1110 | 0.9356 | 0.9488 | 0.9422 | 0.9849 | 0.9452 | 0.9195 | 0.9571 | 0.9186 | 0.9572 |
	| 0.0013 | 10.0 | 1920 | 0.1106 | 0.9327 | 0.9471 | 0.9398 | 0.9843 | 0.9434 | 0.9193 | 0.9554 | 0.9133 | 0.9519 |


	### Framework versions

	- Transformers 4.30.2
	- Pytorch 2.0.1+cu117
	- Datasets 2.13.0
	- Tokenizers 0.13.3