--- language: en tags: - summarization license: mit model-index: - name: SamuelAllen123/t5-efficient-large-nl36_fine_tune_sum_V2 results: - task: type: summarization name: Summarization dataset: name: samsum type: samsum config: samsum split: test metrics: - name: ROUGE-1 type: rouge value: 50.5049 verified: true - name: ROUGE-2 type: rouge value: 25.6469 verified: true - name: ROUGE-L type: rouge value: 41.7544 verified: true - name: ROUGE-LSUM type: rouge value: 46.2055 verified: true - name: loss type: loss value: 1.5158178806304932 verified: true - name: gen_len type: gen_len value: 24.0342 verified: true - task: type: summarization name: Summarization dataset: name: cnn_dailymail type: cnn_dailymail config: 3.0.0 split: test metrics: - name: ROUGE-1 type: rouge value: 34.4055 verified: true - name: ROUGE-2 type: rouge value: 14.127 verified: true - name: ROUGE-L type: rouge value: 24.3353 verified: true - name: ROUGE-LSUM type: rouge value: 31.6582 verified: true - name: loss type: loss value: 2.4456119537353516 verified: true - name: gen_len type: gen_len value: 45.928 verified: true - task: type: summarization name: Summarization dataset: name: samsum type: samsum config: samsum split: train metrics: - name: ROUGE-1 type: rouge value: 54.933 verified: true - name: ROUGE-2 type: rouge value: 31.7965 verified: true - name: ROUGE-L type: rouge value: 47.0057 verified: true - name: ROUGE-LSUM type: rouge value: 51.2027 verified: true - name: loss type: loss value: 1.130684494972229 verified: true - name: gen_len type: gen_len value: 23.7989 verified: true --- Summarize without adding summarize to the start of the string. Trained on Samsum train split. Parameters for training: no_decay = ["bias", "LayerNorm.weight", "layer_norm.weight"] optimizer_grouped_parameters = [ { "params": [p for n, p in model.named_parameters() if not any(nd in n for nd in no_decay)], "weight_decay": 0.0, }, { "params": [p for n, p in model.named_parameters() if any(nd in n for nd in no_decay)], "weight_decay": 0.0, }, ] lr = 0.00005 optimizer = torch.optim.RAdam(optimizer_grouped_parameters, lr=lr) lr_scheduler = get_scheduler( name="linear", optimizer=optimizer, num_warmup_steps=0, num_training_steps=50005) This was only for 10K steps with a batch size of 10 If you want more info, feel free to message me or email me at: samuelfipps@gmail.com