farleyknight/patent-summarization-google-bigbird-pegasus-large-arxiv-2022-09-20

+---
+license: apache-2.0
+tags:
+- generated_from_trainer
+datasets:
+- big_patent
+metrics:
+- rouge
+model-index:
+- name: patent-summarization-google-bigbird-pegasus-large-arxiv-2022-09-20
+  results:
+  - task:
+      name: Sequence-to-sequence Language Modeling
+      type: text2text-generation
+    dataset:
+      name: big_patent
+      type: big_patent
+      config: all
+      split: train
+      args: all
+    metrics:
+    - name: Rouge1
+      type: rouge
+      value: 37.3992
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# patent-summarization-google-bigbird-pegasus-large-arxiv-2022-09-20
+This model is a fine-tuned version of [google/bigbird-pegasus-large-arxiv](https://huggingface.co/google/bigbird-pegasus-large-arxiv) on the big_patent dataset (`all` configuration).
+At the final checkpoint (step 60000, epoch 0.99), it achieves the following results on the evaluation set:
+- Loss: 2.2618
+- Rouge1: 37.3992
+- Rouge2: 13.2731
+- RougeL: 26.0327
+- RougeLsum: 31.0338
+- Gen Len: 114.5195
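
A minimal usage sketch, assuming the checkpoint is published on the Hugging Face Hub under the repository path shown at the top of this page and that the `transformers` library is installed. The helper name `summarize` and the `max_length` value are illustrative choices, not settings taken from this card.

```python
# Hub repository id, taken from the page header (assumed to be publicly loadable).
MODEL_ID = "farleyknight/patent-summarization-google-bigbird-pegasus-large-arxiv-2022-09-20"


def summarize(text, max_length=128):
    """Summarize a patent description with the fine-tuned checkpoint."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=MODEL_ID)
    # truncation=True clips inputs that exceed the model's maximum input length.
    result = summarizer(text, truncation=True, max_length=max_length)
    return result[0]["summary_text"]
```

Calling `summarize(patent_text)` downloads the weights on first use. When summarizing many documents, building the pipeline once and reusing it would avoid reloading the model on every call.
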
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 1
+- eval_batch_size: 1
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 1.0
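
The list above maps onto `Seq2SeqTrainingArguments` roughly as follows. This is a sketch, not the original training script: it assumes single-device training (so that `train_batch_size` equals `per_device_train_batch_size`), and `output_dir` is a hypothetical path.

```python
# Hyperparameters copied from the list above, keyed by the corresponding
# Seq2SeqTrainingArguments field names.
HPARAMS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 1,  # assumes single-device training
    "per_device_eval_batch_size": 1,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 1.0,
}


def build_training_args(output_dir="./patent-summarization"):  # hypothetical path
    """Build training arguments from the hyperparameters listed in this card."""
    # Imported lazily so the dictionary above is usable without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HPARAMS)
```

Settings not listed in the card (evaluation interval, generation options, input and target lengths) are left at their library defaults here and would need to be set to reproduce the run.
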
+### Training results
+| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len  |
+|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
+| 2.6121        | 0.08  | 5000  | 2.5652          | 35.0673 | 12.0073 | 24.5471 | 28.9315   | 119.9866 |
+| 2.5182        | 0.17  | 10000 | 2.4797          | 34.6909 | 11.6432 | 24.8700 | 28.1543   | 119.2043 |
+| 2.5102        | 0.25  | 15000 | 2.4238          | 35.8574 | 12.2402 | 25.0712 | 29.5607   | 115.2890 |
+| 2.4292        | 0.33  | 20000 | 2.3869          | 36.0133 | 12.2453 | 25.4039 | 29.4830   | 112.5920 |
+| 2.3678        | 0.41  | 25000 | 2.3594          | 35.2380 | 11.6833 | 25.0449 | 28.3313   | 119.1739 |
+| 2.3511        | 0.50  | 30000 | 2.3326          | 36.7755 | 12.8394 | 25.7218 | 30.2594   | 110.5819 |
+| 2.3334        | 0.58  | 35000 | 2.3125          | 36.6317 | 12.7493 | 25.5388 | 30.0940   | 115.5998 |
+| 2.3833        | 0.66  | 40000 | 2.2943          | 37.1219 | 13.1564 | 25.7571 | 30.8666   | 113.8222 |
+| 2.3410        | 0.75  | 45000 | 2.2813          | 36.4962 | 12.6225 | 25.6904 | 29.9741   | 115.9845 |
+| 2.3179        | 0.83  | 50000 | 2.2725          | 37.3535 | 13.1596 | 25.7385 | 31.0560   | 117.7754 |
+| 2.3164        | 0.91  | 55000 | 2.2654          | 36.9191 | 12.9316 | 25.7586 | 30.4691   | 116.1670 |
+| 2.3046        | 0.99  | 60000 | 2.2618          | 37.3992 | 13.2731 | 26.0327 | 31.0338   | 114.5195 |
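
For a quick look at how the checkpoints compare, the rows above can be loaded into plain Python. The sketch below copies the step, validation loss, ROUGE-1, and ROUGE-Lsum columns from the table; the names `ROWS` and `best_step` are illustrative.

```python
# (step, validation_loss, rouge1, rougelsum), copied from the table above.
ROWS = [
    (5000, 2.5652, 35.0673, 28.9315),
    (10000, 2.4797, 34.6909, 28.1543),
    (15000, 2.4238, 35.8574, 29.5607),
    (20000, 2.3869, 36.0133, 29.4830),
    (25000, 2.3594, 35.2380, 28.3313),
    (30000, 2.3326, 36.7755, 30.2594),
    (35000, 2.3125, 36.6317, 30.0940),
    (40000, 2.2943, 37.1219, 30.8666),
    (45000, 2.2813, 36.4962, 29.9741),
    (50000, 2.2725, 37.3535, 31.0560),
    (55000, 2.2654, 36.9191, 30.4691),
    (60000, 2.2618, 37.3992, 31.0338),
]


def best_step(rows, column, higher_is_better=True):
    """Return the step whose value at the given column index is best."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])[0]


print(best_step(ROWS, 1, higher_is_better=False))  # lowest validation loss
print(best_step(ROWS, 2))  # highest ROUGE-1
print(best_step(ROWS, 3))  # highest ROUGE-Lsum
```

On these numbers the final checkpoint (step 60000) has the lowest validation loss and the highest ROUGE-1, while ROUGE-Lsum peaks slightly earlier, at step 50000 (31.056 versus 31.0338).
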
+### Framework versions
+- Transformers 4.23.0.dev0
+- PyTorch 1.12.0
+- Datasets 2.4.0
+- Tokenizers 0.12.1