	---
	tags:
	- generated_from_trainer
	metrics:
	- bleu
	model-index:
	- name: BERT2BERT_finetuned
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# BERT2BERT_finetuned

	This model is a fine-tuned version of [JulienRPA/BERT2BERT_pretrained_LC-QuAD_2.0](https://huggingface.co/JulienRPA/BERT2BERT_pretrained_LC-QuAD_2.0) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.3523
	- Bleu: 95.2821
	- Em: 0.1415
	- Rm: 0.3046
	- Gen Len: 58.7746
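The card does not document how to run the checkpoint, so the following is only a minimal sketch of one plausible way to query it. It assumes the repository id is `JulienRPA/BERT2BERT_finetuned`, that the repository ships tokenizer files alongside the weights, and that the checkpoint loads as a `transformers` `EncoderDecoderModel`. The helper name `generate_output`, the example question, and the `max_length` value are hypothetical and not taken from the card.

```python
MODEL_ID = "JulienRPA/BERT2BERT_finetuned"  # assumed repository id


def generate_output(text: str, max_length: int = 128) -> str:
    """Run one input string through the encoder-decoder and return the decoded output.

    Assumption: the checkpoint stores its generation settings (for example the
    decoder start token) in its config, as fine-tuned checkpoints usually do.
    """
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoTokenizer, EncoderDecoderModel

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = EncoderDecoderModel.from_pretrained(MODEL_ID)

    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(
        inputs.input_ids,
        attention_mask=inputs.attention_mask,
        max_length=max_length,  # hypothetical cap, chosen above the reported mean Gen Len
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the checkpoint on first use; the question is made up):
# print(generate_output("Who is the author of Le Petit Prince?"))
```

The expected input format is not stated in the card, so the example question is only a placeholder.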

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 32
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 2000
	- num_epochs: 300.0
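To make the `linear` scheduler with 2000 warmup steps concrete, here is a small sketch of the learning rate it would produce over the run. It assumes the usual behaviour of a linear schedule with warmup (ramp from 0 to the peak rate, then decay linearly to 0 at the last step). It also assumes 39 optimizer steps per epoch, inferred from the training log, where step 500 falls at epoch 12.82. The function and constant names are illustrative.

```python
# Values from the hyperparameter list in this card.
PEAK_LR = 5e-5
WARMUP_STEPS = 2000
NUM_EPOCHS = 300

# Assumption: steps per epoch inferred from the log (step 500 at epoch 12.82).
STEPS_PER_EPOCH = round(500 / 12.82)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_schedule_lr(step: int) -> float:
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, 500, 2000, 6000, 11500):
        print(f"step {step:>5}: lr = {linear_schedule_lr(step):.2e}")
```

Under these assumptions the run lasts 11700 steps, so warmup covers roughly the first 51 epochs and the rate is still a quarter of its peak at the first logged evaluation (step 500).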

	### Training results

	| Training Loss | Epoch | Step | Bleu | Em | Gen Len | Validation Loss | Rm |
	|:-------------:|:------:|:-----:|:-------:|:------:|:-------:|:---------------:|:------:|
	| 4.0279 | 12.82 | 500 | 49.841 | 0.0 | 51.6403 | 2.4031 | 0.0 |
	| 1.3442 | 25.64 | 1000 | 85.0177 | 0.0 | 57.9784 | 0.5014 | 0.0 |
	| 0.2522 | 38.46 | 1500 | 94.0714 | 0.0168 | 57.9137 | 0.3293 | 0.0216 |
	| 0.1534 | 51.28 | 2000 | 94.4328 | 0.0024 | 58.9448 | 0.3207 | 0.0072 |
	| 0.1305 | 64.1 | 2500 | 94.0708 | 0.0 | 59.6115 | 0.3247 | 0.0 |
	| 0.1226 | 76.92 | 3000 | 94.3143 | 0.0024 | 58.235 | 0.3325 | 0.0024 |
	| 0.1131 | 89.74 | 3500 | 94.5678 | 0.0048 | 59.6811 | 0.3401 | 0.0144 |
	| 0.1053 | 102.56 | 4000 | 94.4738 | 0.0168 | 59.0288 | 0.3374 | 0.0552 |
	| 0.0999 | 115.38 | 4500 | 94.6291 | 0.0336 | 58.6283 | 0.3437 | 0.0624 |
	| 0.0941 | 128.21 | 5000 | 94.7896 | 0.0695 | 58.4149 | 0.3512 | 0.1271 |
	| 0.0904 | 141.03 | 5500 | 94.4101 | 0.0719 | 58.2518 | 0.3424 | 0.1439 |
	| 0.0833 | 153.85 | 6000 | 94.7141 | 0.0887 | 59.0312 | 0.3462 | 0.1775 |
	| 0.0772 | 166.67 | 6500 | 94.6758 | 0.0911 | 59.0767 | 0.3467 | 0.2062 |
	| 0.0722 | 179.49 | 7000 | 94.5698 | 0.1055 | 58.1415 | 0.3462 | 0.2398 |
	| 0.0669 | 192.31 | 7500 | 95.0365 | 0.1223 | 58.7794 | 0.3537 | 0.2782 |
	| 0.062 | 205.13 | 8000 | 94.8694 | 0.1247 | 58.211 | 0.3505 | 0.2686 |
	| 0.0576 | 217.95 | 8500 | 94.8168 | 0.1271 | 59.0791 | 0.3511 | 0.2926 |
	| 0.0539 | 230.77 | 9000 | 95.1935 | 0.1367 | 58.6787 | 0.3490 | 0.3046 |
	| 0.0502 | 243.59 | 9500 | 95.1882 | 0.1319 | 58.5228 | 0.3490 | 0.3141 |
	| 0.0473 | 256.41 | 10000 | 95.1198 | 0.1319 | 58.4245 | 0.3504 | 0.307 |
	| 0.045 | 269.23 | 10500 | 95.047 | 0.1343 | 58.3213 | 0.3505 | 0.307 |
	| 0.0429 | 282.05 | 11000 | 95.2397 | 0.1391 | 58.7242 | 0.3522 | 0.3046 |
	| 0.0416 | 294.87 | 11500 | 95.2821 | 0.1415 | 58.7746 | 0.3523 | 0.3046 |
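The log above can be read in more than one way: validation loss reaches its minimum early, while Bleu, Em, and Rm keep improving for thousands of steps afterwards. The sketch below, which is an illustration rather than part of the original training code, finds the best logged step under each metric. Each tuple holds (step, Bleu, Em, validation loss, Rm), with the last three rows matched to the evaluation summary at the top of the card; the names `EvalRow` and `best_step` are hypothetical.

```python
from collections import namedtuple

EvalRow = namedtuple("EvalRow", "step bleu em val_loss rm")

# Evaluation log transcribed from the training results table.
LOG = [EvalRow(*r) for r in [
    (500, 49.841, 0.0, 2.4031, 0.0),
    (1000, 85.0177, 0.0, 0.5014, 0.0),
    (1500, 94.0714, 0.0168, 0.3293, 0.0216),
    (2000, 94.4328, 0.0024, 0.3207, 0.0072),
    (2500, 94.0708, 0.0, 0.3247, 0.0),
    (3000, 94.3143, 0.0024, 0.3325, 0.0024),
    (3500, 94.5678, 0.0048, 0.3401, 0.0144),
    (4000, 94.4738, 0.0168, 0.3374, 0.0552),
    (4500, 94.6291, 0.0336, 0.3437, 0.0624),
    (5000, 94.7896, 0.0695, 0.3512, 0.1271),
    (5500, 94.4101, 0.0719, 0.3424, 0.1439),
    (6000, 94.7141, 0.0887, 0.3462, 0.1775),
    (6500, 94.6758, 0.0911, 0.3467, 0.2062),
    (7000, 94.5698, 0.1055, 0.3462, 0.2398),
    (7500, 95.0365, 0.1223, 0.3537, 0.2782),
    (8000, 94.8694, 0.1247, 0.3505, 0.2686),
    (8500, 94.8168, 0.1271, 0.3511, 0.2926),
    (9000, 95.1935, 0.1367, 0.3490, 0.3046),
    (9500, 95.1882, 0.1319, 0.3490, 0.3141),
    (10000, 95.1198, 0.1319, 0.3504, 0.307),
    (10500, 95.047, 0.1343, 0.3505, 0.307),
    (11000, 95.2397, 0.1391, 0.3522, 0.3046),
    (11500, 95.2821, 0.1415, 0.3523, 0.3046),
]]


def best_step(metric: str, higher_is_better: bool = True) -> int:
    """Return the logged step with the best value of `metric` (first one on ties)."""
    pick = max if higher_is_better else min
    return pick(LOG, key=lambda row: getattr(row, metric)).step


if __name__ == "__main__":
    print("lowest validation loss:", best_step("val_loss", higher_is_better=False))
    for metric in ("bleu", "em", "rm"):
        print(f"highest {metric}:", best_step(metric))
```

Selecting on validation loss would pick a much earlier checkpoint than selecting on Bleu or Em, so which metric drives checkpoint choice matters for this run. The card does not say which criterion, if any, was used.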


	### Framework versions

	- Transformers 4.30.0.dev0
	- PyTorch 2.0.1+cu118
	- Datasets 2.12.0
	- Tokenizers 0.13.3