	---
	tags:
	- generated_from_trainer
	metrics:
	- bleu
	model-index:
	- name: BERT2BERT_finetuned
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# BERT2BERT_finetuned

	This model is a fine-tuned version of [JulienRPA/BERT2BERT_pretrained_LC-QuAD_2.0](https://huggingface.co/JulienRPA/BERT2BERT_pretrained_LC-QuAD_2.0) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.1672
	- Bleu: 96.7679
	- Em: 0.6307
	- Rm: 0.7482
	- Gen Len: 75.6355
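
The checkpoint can be tried out with the `transformers` library. The following is a minimal sketch rather than a usage pattern documented by the author. It assumes that the repository `JulienRPA/BERT2BERT_finetuned` ships tokenizer files next to the weights, that its config defines `decoder_start_token_id`, and that PyTorch is installed. The helper name `generate_text` and the default `max_new_tokens=128` (chosen to sit above the reported generation length of about 76 tokens) are illustrative.

```python
MODEL_ID = "JulienRPA/BERT2BERT_finetuned"


def generate_text(text: str, max_new_tokens: int = 128) -> str:
    """Run one input string through the encoder-decoder and return the decoded output.

    Hypothetical helper: assumes the Hub repository contains both tokenizer
    files and a generation-ready config.
    """
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoTokenizer, EncoderDecoderModel

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = EncoderDecoderModel.from_pretrained(MODEL_ID)

    inputs = tokenizer(text, return_tensors="pt")
    # Pass only the tensors the encoder needs; BERT tokenizers also emit
    # token_type_ids, which are left out here.
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("...")` downloads the weights on first use. The card does not state what the inputs and outputs look like, so any prompt format is for the user to determine.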

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 32
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 2000
	- num_epochs: 300.0
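
To make the `linear` scheduler with 2000 warmup steps concrete, here is a small pure-Python sketch of the learning rate per optimizer step: a linear ramp from 0 to the base rate, followed by a linear decay to 0. The total step count is not stated in this card; `TOTAL_STEPS = 11700` is an assumption derived from the training-results table (500 steps correspond to 12.82 epochs, so about 39 steps per epoch, times 300 epochs). The function name `linear_lr` is illustrative.

```python
BASE_LR = 5e-05       # learning_rate from the list above
WARMUP_STEPS = 2000   # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 11700   # ASSUMPTION: 39 steps/epoch x 300 epochs, inferred from the results table


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 1000, 2000, 6850, 11700):
    print(step, linear_lr(step))
```

Under these assumptions the rate peaks at step 2000, which the results table places at epoch 51.28, so roughly the first sixth of training is spent in warmup.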

	### Training results

	| Training Loss | Epoch | Step | Bleu | Em | Gen Len | Validation Loss | Rm |
	|:-------------:|:------:|:-----:|:-------:|:------:|:-------:|:---------------:|:------:|
	| 3.4354 | 12.82 | 500 | 56.6427 | 0.0 | 70.5947 | 1.5065 | 0.0 |
	| 0.8473 | 25.64 | 1000 | 90.5419 | 0.0192 | 76.9736 | 0.3859 | 0.0216 |
	| 0.2049 | 38.46 | 1500 | 93.6495 | 0.0504 | 75.1655 | 0.2472 | 0.0671 |
	| 0.1222 | 51.28 | 2000 | 93.8388 | 0.0959 | 75.6403 | 0.2338 | 0.1487 |
	| 0.0923 | 64.1 | 2500 | 94.71 | 0.2158 | 75.8177 | 0.1944 | 0.2662 |
	| 0.0752 | 76.92 | 3000 | 95.0458 | 0.2662 | 75.2638 | 0.1990 | 0.3022 |
	| 0.0627 | 89.74 | 3500 | 95.3518 | 0.3429 | 76.9928 | 0.1957 | 0.3957 |
	| 0.052 | 102.56 | 4000 | 95.5392 | 0.3837 | 76.1007 | 0.1861 | 0.4508 |
	| 0.0457 | 115.38 | 4500 | 95.6692 | 0.4173 | 76.1727 | 0.1880 | 0.4892 |
	| 0.0386 | 128.21 | 5000 | 95.9215 | 0.446 | 76.0168 | 0.1850 | 0.5276 |
	| 0.0321 | 141.03 | 5500 | 95.931 | 0.4964 | 75.2566 | 0.1724 | 0.5875 |
	| 0.026 | 153.85 | 6000 | 96.4317 | 0.5348 | 75.741 | 0.1687 | 0.6499 |
	| 0.0242 | 166.67 | 6500 | 96.197 | 0.5372 | 76.1127 | 0.1707 | 0.6403 |
	| 0.0193 | 179.49 | 7000 | 96.3422 | 0.5564 | 75.3933 | 0.1643 | 0.6691 |
	| 0.0164 | 192.31 | 7500 | 96.5278 | 0.5779 | 75.4508 | 0.1650 | 0.693 |
	| 0.0139 | 205.13 | 8000 | 96.6382 | 0.6091 | 75.9592 | 0.1668 | 0.7314 |
	| 0.012 | 217.95 | 8500 | 96.5488 | 0.6163 | 76.0024 | 0.1644 | 0.729 |
	| 0.0106 | 230.77 | 9000 | 96.6353 | 0.6091 | 75.5468 | 0.1653 | 0.7266 |
	| 0.0093 | 243.59 | 9500 | 96.8984 | 0.6331 | 75.7242 | 0.1663 | 0.7482 |
	| 0.0084 | 256.41 | 10000 | 96.6199 | 0.6331 | 75.3885 | 0.1676 | 0.7482 |
	| 0.0076 | 269.23 | 10500 | 96.5038 | 0.6283 | 75.3453 | 0.1678 | 0.7482 |
	| 0.007 | 282.05 | 11000 | 96.7187 | 0.6355 | 75.9281 | 0.1669 | 0.7458 |
	| 0.0065 | 294.87 | 11500 | 96.7679 | 0.6307 | 75.6355 | 0.1672 | 0.7482 |
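
The table can be read programmatically to answer two practical questions: how many optimizer steps make up one epoch, and which logged checkpoint scores best. The sketch below copies the Step, Epoch, Bleu and Validation Loss values from the table. The training-set size estimate at the end is only a rough bound and rests on assumptions the card does not confirm: a single device, no gradient accumulation, and a kept final partial batch.

```python
# (step, epoch, bleu, validation_loss), copied from the training-results table
ROWS = [
    (500, 12.82, 56.6427, 1.5065), (1000, 25.64, 90.5419, 0.3859),
    (1500, 38.46, 93.6495, 0.2472), (2000, 51.28, 93.8388, 0.2338),
    (2500, 64.1, 94.71, 0.1944), (3000, 76.92, 95.0458, 0.1990),
    (3500, 89.74, 95.3518, 0.1957), (4000, 102.56, 95.5392, 0.1861),
    (4500, 115.38, 95.6692, 0.1880), (5000, 128.21, 95.9215, 0.1850),
    (5500, 141.03, 95.931, 0.1724), (6000, 153.85, 96.4317, 0.1687),
    (6500, 166.67, 96.197, 0.1707), (7000, 179.49, 96.3422, 0.1643),
    (7500, 192.31, 96.5278, 0.1650), (8000, 205.13, 96.6382, 0.1668),
    (8500, 217.95, 96.5488, 0.1644), (9000, 230.77, 96.6353, 0.1653),
    (9500, 243.59, 96.8984, 0.1663), (10000, 256.41, 96.6199, 0.1676),
    (10500, 269.23, 96.5038, 0.1678), (11000, 282.05, 96.7187, 0.1669),
    (11500, 294.87, 96.7679, 0.1672),
]

# Steps per epoch implied by each row; every row should agree.
steps_per_epoch = {round(step / epoch) for step, epoch, _, _ in ROWS}

# Best logged checkpoints by BLEU (higher is better) and validation loss (lower is better).
best_bleu = max(ROWS, key=lambda row: row[2])
lowest_loss = min(ROWS, key=lambda row: row[3])

# ASSUMPTION: single device, no gradient accumulation, last partial batch kept.
TRAIN_BATCH_SIZE = 32
(steps,) = steps_per_epoch
min_examples = (steps - 1) * TRAIN_BATCH_SIZE + 1
max_examples = steps * TRAIN_BATCH_SIZE

print("steps per epoch:", steps)
print("best BLEU:", best_bleu[2], "at step", best_bleu[0])
print("lowest validation loss:", lowest_loss[3], "at step", lowest_loss[0])
print("estimated training examples:", min_examples, "to", max_examples)
```

BLEU peaks at step 9500 and validation loss bottoms out at step 7000, so the final row (step 11500), which matches the headline results, is not the best logged checkpoint on either measure, although the differences are small.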


	### Framework versions

	- Transformers 4.30.0.dev0
	- PyTorch 2.0.1+cu118
	- Datasets 2.12.0
	- Tokenizers 0.13.3
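
When reproducing results, it can help to compare the local environment against the versions listed above. The following sketch uses only the standard library; the helper name `compare_versions` is illustrative, and the keys are the usual PyPI distribution names for these libraries. Note that `4.30.0.dev0` is a development build, so an exact match for Transformers is unlikely with a released package.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed in this card, keyed by PyPI distribution name.
CARD_VERSIONS = {
    "transformers": "4.30.0.dev0",
    "torch": "2.0.1+cu118",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}


def compare_versions(expected=CARD_VERSIONS):
    """Return {package: (version in card, installed version or None)}."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed)
    return report


for package, (wanted, installed) in compare_versions().items():
    status = "match" if wanted == installed else "differs"
    print(f"{package}: card={wanted} installed={installed} ({status})")
```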