---
tags:
- generated_from_trainer
- summarization
- book summary
metrics:
- rouge
datasets:
- kmfoda/booksum
model-index:
- name: long-t5-tglobal-large-booksum-WIP
  results:
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: kmfoda/booksum
      type: kmfoda/booksum
      config: kmfoda--booksum
      split: test
    metrics:
    - type: rouge
      value: 25.6136
      name: ROUGE-1
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiY2E3ZWI5NjRiZGE3YTQ2YTg5MGNmNzI5NTdjN2U3OTNiNzhmMjBhMDVkZjcwZjg0MTEyMTM3MzQyZmI1NzNjYSIsInZlcnNpb24iOjF9.REYAFwePFucxAn1Twsh9BSov9KPsCML9nTjL9oIIWa3Hp8DwJ_syPmfNsYxGe2vvNVq5rzBKF9gsJW80pbo-Aw
    - type: rouge
      value: 2.8652
      name: ROUGE-2
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMzk5Mjg0ZmRjYzg1NjM4MGMwOWYyOTM0ZDU2OTM2ZGJlYmM0OTVjNTI2NzcyMzU0MGI0M2I0ZmE0ZmY2NmRlNSIsInZlcnNpb24iOjF9.MzKSIqRjIV6V5YMYlvbRt2ca_CR5WFZ8DqOrUvDbiSyh7qbdU6F2LdDjB6eL-wzIR_DMF10sTtoF7H7wXs2GDw
    - type: rouge
      value: 12.4913
      name: ROUGE-L
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZDMzMDZhYzg2N2Q0YTZiYWUzOGI2MTRjMmRlNGIzY2I0ZDU3YzQ1MWVkZDlkOTQzNDlhNjk1MWM2OWUwNDczYSIsInZlcnNpb24iOjF9.TysgYlvfe-4GJWDSFg8KQ97Bsu-kDX3VDamS6bi9q_60V3mBzIOz0M0slySuHXu5S4MJ8a0OCPWvskP0T4ZmCQ
    - type: rouge
      value: 23.1102
      name: ROUGE-LSUM
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMzY3NmI2MDJkZTQ2MzMwMDg2NWZmM2Q5NjNmZTRkMTJiODViODZmODYyNTgwMzBkYzBmZDRmMWNjYjg5NjBkYSIsInZlcnNpb24iOjF9.XNvINLow-1mfiDbm_YcAM_l4c-gEV_V5oLKzBWh7Hdmi9gHP_Z86fqQn9Kj2nhOPFWcUOFUBIzx4Z0rjs162BA
    - type: loss
      value: 5.004334926605225
      name: loss
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiODNjMzI5N2IwNDExOWQzMWYxMzE4YzkxYWYxZmRkNTA2NWQ1MmYzOTFjODJhNGUzODQxYmNkODBlZDA0MGNmZCIsInZlcnNpb24iOjF9.xGNlloXeHra0K5DTKXbsrrkyuAvFXZwjzkxOyjtpw2jWs0KPw4nQ1MKkJiX6juXtleJrvS2u1FQcwCbygUmLDQ
    - type: gen_len
      value: 89.4354
      name: gen_len
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYjBlODBiMmEwN2UzYzE5NTE3ODBkNDVmMTgxMzhlYmVmZjgxMzJjYTBlYjBhMDgzNzhhMWQ0Mzc2MjdjN2E0ZiIsInZlcnNpb24iOjF9.Z9kytQDiNK-TCaHz-0YZeH8FCrW5D0SA-ji7Q86wqdhBC9jTDmJGnBll6mGFcHipERrRKZb12hYStKJanb3iBA
---

# tglobal-large-booksum-WIP

> This is a WIP checkpoint that has been fine-tuned from the vanilla (original) model for roughly 10 epochs. It is **not ready to be used for inference**.

This model is a fine-tuned version of [google/long-t5-tglobal-large](https://huggingface.co/google/long-t5-tglobal-large) on the `kmfoda/booksum` dataset. It achieves the following results on the evaluation set:
- Loss: 4.9519
- Rouge1: 21.8058
- Rouge2: 2.9343
- Rougel: 10.3717
- Rougelsum: 20.1537
- Gen Len: 106.055

## Model description

This checkpoint tests fine-tuning on BookSum alone, with 16384-token inputs and 1024-token outputs for the entire run (vs. the previous large WIP checkpoint, which started from a partially-trained `pubmed` checkpoint).

## Intended uses & limitations

This is a WIP checkpoint that has been fine-tuned from the vanilla (original) model for roughly 10 epochs. It is **not ready to be used for inference**.

## Training and evaluation data

This model is fine-tuned **only** on BookSum (vs. the previous large WIP checkpoint, which started from a partially-trained `pubmed` checkpoint).

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0004
- train_batch_size: 1
- eval_batch_size: 1
- seed: 31060
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 32
- total_train_batch_size: 128
- total_eval_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 3.0
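For reference, a minimal sketch of an equivalent data and training setup with `datasets` and `transformers` is below. This is not the actual training script: the BookSum column names (`chapter`, `summary_text`) and the `output_dir` are assumptions, while the sequence lengths (16384/1024) and the hyperparameters mirror the list above.

```python
from datasets import load_dataset
from transformers import AutoTokenizer, Seq2SeqTrainingArguments

# BookSum chapter/summary pairs; the column names used below are assumptions
dataset = load_dataset("kmfoda/booksum")
tokenizer = AutoTokenizer.from_pretrained("google/long-t5-tglobal-large")

def preprocess(batch):
    # 16384-token inputs and 1024-token targets, matching this checkpoint's setup
    model_inputs = tokenizer(batch["chapter"], max_length=16384, truncation=True)
    labels = tokenizer(text_target=batch["summary_text"], max_length=1024, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset["train"].column_names)

# mirrors the hyperparameter list above; with 4 GPUs: 1 x 32 x 4 = 128 effective batch size
training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-tglobal-large-booksum-WIP",  # assumed name
    learning_rate=4e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=32,
    seed=31060,
    lr_scheduler_type="cosine",
    num_train_epochs=3.0,
    predict_with_generate=True,
)
```

The Adam settings listed above (betas=(0.9, 0.999), epsilon=1e-08) are the `Trainer` defaults, so they are not set explicitly in this sketch.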
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
| 5.0389        | 0.99  | 37   | 5.1884          | 29.995  | 4.4045 | 12.8837 | 27.557    | 219.03  |
| 4.8986        | 1.0   | 75   | 5.1286          | 26.921  | 3.7193 | 11.3605 | 25.3492   | 276.005 |
| 4.5928        | 2.0   | 150  | 4.9900          | 26.6667 | 3.7342 | 11.8223 | 24.7087   | 178.775 |
| 4.6159        | 3.0   | 225  | 4.9519          | 21.8058 | 2.9343 | 10.3717 | 20.1537   | 106.055 |

#### eval in bf16

```
***** eval metrics *****
  epoch                   =        3.0
  eval_gen_len            =    103.075
  eval_loss               =     4.9501
  eval_rouge1             =    21.6345
  eval_rouge2             =      2.877
  eval_rougeL             =     10.386
  eval_rougeLsum          =    20.0148
  eval_runtime            = 0:06:02.75
  eval_samples            =        200
  eval_samples_per_second =      0.551
  eval_steps_per_second   =      0.138
```

### Framework versions

- Transformers 4.25.0.dev0
- Pytorch 1.13.0+cu117
- Datasets 2.6.1
- Tokenizers 0.13.1
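The "eval in bf16" numbers above correspond to running evaluation with the model weights in bfloat16. A minimal sketch of loading a checkpoint that way is below; the hub id is a placeholder, and, as noted above, this WIP checkpoint is not ready for inference.

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "<namespace>/long-t5-tglobal-large-booksum-WIP"  # placeholder hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
# cast the weights to bfloat16 at load time, as in the eval run above
model = AutoModelForSeq2SeqLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

text = "..."  # a book chapter; inputs are truncated to 16384 tokens
inputs = tokenizer(text, max_length=16384, truncation=True, return_tensors="pt")
summary_ids = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```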