# t5-small-squad-qg-a2c-spt
This model is a fine-tuned version of lmqg/t5-small-squad-qg on the qg_squadshifts dataset. It achieves the following results on the evaluation set:
- Loss: 3.4424
- Bleu: 0.2369
- Precisions: [0.5087032407189018, 0.27403783600926385, 0.18636099825885083, 0.1320389623167492]
- Brevity Penalty: 0.9790
- Length Ratio: 0.9793
- Translation Length: 42398
- Reference Length: 43296
## Model description
More information needed
## Intended uses & limitations
More information needed
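As a minimal sketch of how a checkpoint like this could be used for question generation, the example below loads it with the standard `transformers` text2text-generation pipeline. The `generate question:` prefix and `<hl>` highlight markers around the answer span follow the input convention of the lmqg/t5-small-squad-qg base model; the repository id and example context are illustrative, not taken from this card.

```python
from transformers import pipeline

# Illustrative only: the model id below is a placeholder for wherever this
# checkpoint is hosted, and the <hl>-highlight input format is assumed to
# match the lmqg/t5-small-squad-qg base model.
qg = pipeline("text2text-generation", model="t5-small-squad-qg-a2c-spt")

context = (
    "generate question: Beyonce further expanded her acting career, starring as blues "
    "singer <hl> Etta James <hl> in the 2008 musical biopic, Cadillac Records."
)
print(qg(context, max_length=64)[0]["generated_text"])
```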
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a sketch of the equivalent training arguments follows the list):
- learning_rate: 0.0001
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
- label_smoothing_factor: 0.15
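For reference, these settings map roughly onto `Seq2SeqTrainingArguments` from the Transformers `Trainer` API. The sketch below is an approximation, not the original training script; the output directory and evaluation strategy are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

# Approximation of the listed hyperparameters; the Adam betas/epsilon above
# are the Trainer defaults, so they are not set explicitly here.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-squad-qg-a2c-spt",   # placeholder, not from the original run
    learning_rate=1e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    label_smoothing_factor=0.15,
    evaluation_strategy="epoch",              # assumption: metrics below are reported per epoch
    predict_with_generate=True,               # assumption: generation needed for BLEU at eval time
)
```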
### Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length |
|---|---|---|---|---|---|---|---|---|---|
| 3.646 | 1.0 | 42 | 3.4501 | 0.2324 | [0.5031057356491574, 0.26807773815061764, 0.18061915046796256, 0.12696709585121602] | 0.9853 | 0.9854 | 42663 | 43296 |
| 3.5951 | 2.0 | 84 | 3.4456 | 0.2328 | [0.5061518076850274, 0.27000913957435696, 0.1832138903455107, 0.12926178476134007] | 0.9759 | 0.9762 | 42264 | 43296 |
| 3.5572 | 3.0 | 126 | 3.4427 | 0.2355 | [0.505242954779515, 0.27049412978970455, 0.18334962341171734, 0.12953889087192133] | 0.9867 | 0.9868 | 42724 | 43296 |
| 3.5295 | 4.0 | 168 | 3.4411 | 0.2351 | [0.5057055646865461, 0.27130317702804174, 0.1838566316518527, 0.12948538278525568] | 0.9836 | 0.9837 | 42590 | 43296 |
| 3.4945 | 5.0 | 210 | 3.4418 | 0.2359 | [0.5068653913859875, 0.27228491562273527, 0.18446938010211442, 0.1297804417225878] | 0.9839 | 0.9840 | 42605 | 43296 |
| 3.4771 | 6.0 | 252 | 3.4432 | 0.2375 | [0.507522591245159, 0.2735272802567554, 0.18594051980269422, 0.13157208938693074] | 0.9839 | 0.9840 | 42605 | 43296 |
| 3.46 | 7.0 | 294 | 3.4431 | 0.2377 | [0.5092926294961487, 0.2746595987943041, 0.1869911632623497, 0.13212859294179272] | 0.9803 | 0.9805 | 42453 | 43296 |
| 3.4656 | 8.0 | 336 | 3.4413 | 0.2368 | [0.5082384555547698, 0.2738076663025953, 0.18616789908655937, 0.1317669419321012] | 0.9796 | 0.9798 | 42423 | 43296 |
| 3.443 | 9.0 | 378 | 3.4425 | 0.2373 | [0.5089378360532025, 0.27438532587485365, 0.18661869668658967, 0.13227530576778043] | 0.9792 | 0.9794 | 42404 | 43296 |
| 3.4455 | 10.0 | 420 | 3.4424 | 0.2369 | [0.5087032407189018, 0.27403783600926385, 0.18636099825885083, 0.1320389623167492] | 0.9790 | 0.9793 | 42398 | 43296 |
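The Bleu, Precisions, Brevity Penalty, Length Ratio, Translation Length, and Reference Length columns correspond to the output fields of the Hugging Face `evaluate` BLEU metric. A hedged sketch of computing them from generated and reference questions is shown below; the example strings are placeholders.

```python
import evaluate

# Placeholders: in the actual evaluation these would be the model's generated
# questions and the gold questions from qg_squadshifts.
predictions = ["who did beyonce play in cadillac records?"]
references = [["who did beyonce portray in the film cadillac records?"]]

bleu = evaluate.load("bleu")
result = bleu.compute(predictions=predictions, references=references)
# result has keys: bleu, precisions, brevity_penalty, length_ratio,
# translation_length, reference_length - the same fields reported above.
print(result)
```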
### Framework versions
- Transformers 4.27.4
- Pytorch 1.9.0
- Datasets 2.9.0
- Tokenizers 0.13.2