esuriddick committed
Commit 9ce5f5b
1 Parent(s): 5e35bbc

Update README.md

Files changed (1)
  1. README.md +7 -5
README.md CHANGED
@@ -18,17 +18,19 @@ This model is a fine-tuned version of [allenai/led-base-16384](https://huggingfa
  It achieves the following results on the evaluation set:
  - Loss: 1.2887

+ The processing time and memory required to compute the ROUGE metrics on the validation and test sets exceeded what Kaggle supported at the time of training.
+
  ## Model description

- More information needed
+ As described in [Longformer: The Long-Document Transformer](https://arxiv.org/pdf/2004.05150.pdf) by Iz Beltagy, Matthew E. Peters, Arman Cohan, [Allenai's Longformer Encoder-Decoder (LED)](https://github.com/allenai/longformer#longformer) was initialized from [*bart-base*](https://huggingface.co/facebook/bart-base) since both models share the exact same architecture. To be able to process 16K tokens, *bart-base*'s position embedding matrix was simply copied 16 times.

- ## Intended uses & limitations
+ This model is especially interesting for long-range summarization and question answering.

- More information needed
+ ## Intended uses & limitations

- ## Training and evaluation data
+ [pszemraj/govreport-summarization-8192](https://huggingface.co/datasets/pszemraj/govreport-summarization-8192) is a pre-processed version of [ccdv/govreport-summarization](https://huggingface.co/datasets/ccdv/govreport-summarization), a dataset for the summarization of long documents adapted from this [repository](https://github.com/luyang-huang96/LongDocSum) and this [paper](https://arxiv.org/pdf/2104.02112.pdf).

- More information needed
+ Allenai's LED model was fine-tuned on this dataset, allowing the summarization of documents of up to 16,384 tokens.


  ## Training procedure
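
The model description added above says LED extends *bart-base* to a 16K-token encoder window by tiling its position embedding matrix. A minimal sketch (assuming the Hugging Face `transformers` library is installed; this code is not part of the commit) that inspects the base checkpoint's configuration to confirm those limits:

```python
# Sketch only: inspect allenai/led-base-16384's configuration to see the
# extended encoder window described in the model description above.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("allenai/led-base-16384")
print(config.max_encoder_position_embeddings)  # 16384 encoder positions (16x bart-base's 1024)
print(config.max_decoder_position_embeddings)  # the decoder keeps the original 1024 positions
```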
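The intended-uses text above describes summarizing documents of up to 16,384 tokens with the fine-tuned checkpoint. A hedged usage sketch with `transformers`: the repository id below is a placeholder (it is not given in this diff), and the generation settings are illustrative defaults rather than values stated by the author.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_ID = "path/to/this-fine-tuned-led-checkpoint"  # placeholder, not a real Hub id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

long_report = "..."  # a long document, up to ~16,384 tokens

inputs = tokenizer(long_report, max_length=16384, truncation=True, return_tensors="pt")

# LED combines local attention with global attention; putting global attention
# on the first token is the usual convention for LED summarization.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    global_attention_mask=global_attention_mask,
    max_length=512,
    num_beams=4,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```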