e22vvb/ALL_mt5-base_15_spider_no_sch_15_wikiSQL_no_sch

+---
+tags:
+- generated_from_trainer
+model-index:
+- name: ALL_mt5-base_15_spider_no_sch_15_wikiSQL_no_sch
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# ALL_mt5-base_15_spider_no_sch_15_wikiSQL_no_sch
+This model was trained from scratch on an unknown dataset.
+It achieves the following results on the evaluation set after the final epoch (epoch 15, step 19395):
+- Loss: 0.0267
+- Rouge2 Precision: 0.8214
+- Rouge2 Recall: 0.551
+- Rouge2 Fmeasure: 0.6267
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 15
+- eval_batch_size: 16
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 15
+### Training results
+| Training Loss | Epoch | Step  | Validation Loss | Rouge2 Precision | Rouge2 Recall | Rouge2 Fmeasure |
+|:-------------:|:-----:|:-----:|:---------------:|:----------------:|:-------------:|:---------------:|
+| 0.3337        | 1.0   | 1293  | 0.1946          | 0.4739           | 0.3026        | 0.3469          |
+| 0.1986        | 2.0   | 2586  | 0.1295          | 0.5591           | 0.3707        | 0.4203          |
+| 0.1594        | 3.0   | 3879  | 0.0988          | 0.6081           | 0.4048        | 0.4587          |
+| 0.1248        | 4.0   | 5172  | 0.0784          | 0.6641           | 0.446         | 0.5058          |
+| 0.1085        | 5.0   | 6465  | 0.0639          | 0.6994           | 0.4687        | 0.5318          |
+| 0.092         | 6.0   | 7758  | 0.0548          | 0.7211           | 0.4841        | 0.5496          |
+| 0.0807        | 7.0   | 9051  | 0.0472          | 0.7398           | 0.4954        | 0.5633          |
+| 0.0739        | 8.0   | 10344 | 0.0419          | 0.7611           | 0.5104        | 0.5801          |
+| 0.0671        | 9.0   | 11637 | 0.0368          | 0.7779           | 0.5215        | 0.5926          |
+| 0.0629        | 10.0  | 12930 | 0.0336          | 0.8019           | 0.5391        | 0.6123          |
+| 0.0587        | 11.0  | 14223 | 0.0309          | 0.7974           | 0.5358        | 0.6087          |
+| 0.0557        | 12.0  | 15516 | 0.0289          | 0.809            | 0.5446        | 0.6186          |
+| 0.0532        | 13.0  | 16809 | 0.0278          | 0.8203           | 0.5502        | 0.6259          |
+| 0.051         | 14.0  | 18102 | 0.0270          | 0.8202           | 0.5501        | 0.6258          |
+| 0.0504        | 15.0  | 19395 | 0.0267          | 0.8214           | 0.551         | 0.6267          |
+### Framework versions
+- Transformers 4.38.2
+- Pytorch 2.2.0
+- Datasets 2.16.1
+- Tokenizers 0.15.1
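
As a rough illustration of the training setup recorded in the card above, the sketch below reproduces the step counts from the results table and the learning rate implied by a linear schedule. It assumes zero warmup steps, a single device, and no gradient accumulation, none of which the card states. `linear_lr` is a hypothetical helper written for this note, not a Transformers API.

```python
# Values copied from the model card.
BASE_LR = 5e-05
NUM_EPOCHS = 15
STEPS_PER_EPOCH = 1293  # step count logged at epoch 1.0
TRAIN_BATCH_SIZE = 15

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero.

    Assumption: warmup_steps=0, because the card records no warmup setting.
    """
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


# Upper bound on training examples seen per epoch, assuming one device and
# no gradient accumulation (the last batch of an epoch may be partial).
max_examples_per_epoch = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print("total optimizer steps:", TOTAL_STEPS)
    for epoch in (0, 5, 10, 15):
        print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the total step count matches the last row of the results table (19395), and the learning rate reaches zero at that step.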

generation_config.json ADDED

	@@ -0,0 +1,6 @@

+{
+  "decoder_start_token_id": 0,
+  "eos_token_id": 1,
+  "pad_token_id": 0,
+  "transformers_version": "4.38.2"
+}
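
To show what the three token IDs in this config mean in practice, here is a minimal sketch that post-processes a generated ID sequence: decoding starts from `decoder_start_token_id`, a finished sequence ends at `eos_token_id`, and shorter sequences in a batch are filled with `pad_token_id`. The helper `strip_special_ids` and the example IDs are hypothetical and for illustration only; they are not part of the repository or of the Transformers API.

```python
import json

# The generation config exactly as added in this commit.
GENERATION_CONFIG = json.loads("""
{
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0,
  "transformers_version": "4.38.2"
}
""")


def strip_special_ids(token_ids: list[int], config: dict) -> list[int]:
    """Drop the leading decoder start token, cut at the first EOS, drop padding."""
    ids = list(token_ids)
    if ids and ids[0] == config["decoder_start_token_id"]:
        ids = ids[1:]
    eos = config["eos_token_id"]
    if eos in ids:
        ids = ids[: ids.index(eos)]
    return [i for i in ids if i != config["pad_token_id"]]


# Made-up token IDs, not real tokenizer output.
example = [0, 259, 1712, 4321, 1, 0, 0]
print(strip_special_ids(example, GENERATION_CONFIG))
```

In normal use a tokenizer's decode step with special tokens skipped does this job; the sketch only makes the role of each ID explicit.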