igorgavi committed
Commit
534a51a
1 Parent(s): 824dfb9

Update README.md

Files changed (1)
  1. README.md +17 -13
README.md CHANGED
@@ -39,19 +39,23 @@ and English.
  ## Model description

- This Automatic Text Summarization (ATS) Model was developed to be applied to the Research Financing Products Portfolio (FPP)
- of the Brazilian Ministry of Science, Technology and Innovation. It was produced in parallel with the writing of a Systematic
- Literature Review paper, in which there is a discussion concerning many summarization methods, datasets, and evaluators as well
- as a brief overview of the nature of the task itself and the state of the art of its implementation.
-
- The input of the model can be either a single text or a csv file containing multiple texts (in the English language) and its outputs are the summarized texts
- and their evaluation metrics. As an optional (although recommended) input, the model accepts gold-standard summaries for the texts,
- i.e., human-written (or extracted) summaries of the texts which are considered to be good representations of their contents. Evaluators
- like ROUGE, which in its many variations is the most widely used for the task, require gold-standard summaries as inputs. There are, however,
- evaluation methods which do not depend on the existence of a gold-standard summary (e.g. the cosine similarity method, the Kullback-Leibler divergence method),
- and this is why an evaluation can be made even when only the text is taken as an input to the model.
-
-
 
+ This Automatic Text Summarization (ATS) Model was developed in the Python language to be applied to the Research Financing Products
+ Portfolio (FPP) of the Brazilian Ministry of Science, Technology and Innovation. It was produced in parallel with the writing of a
+ Systematic Literature Review paper, in which there is a discussion concerning many summarization methods, datasets, and evaluators
+ as well as a brief overview of the nature of the task itself and the state of the art of its implementation.
+
+ The input of the model can be either a single text, a dataframe or a csv file containing multiple texts (in the English language) and its outputs
+ are the summarized texts and their evaluation metrics. As an optional (although recommended) input, the model accepts gold-standard summaries
+ for the texts, i.e., human-written (or extracted) summaries of the texts which are considered to be good representations of their contents.
+ Evaluators like ROUGE, which in its many variations is the most widely used for the task, require gold-standard summaries as inputs. There are,
+ however, evaluation methods which do not depend on the existence of a gold-standard summary (e.g. the cosine similarity method, the Kullback-Leibler
+ divergence method), and this is why an evaluation can be made even when only the text is taken as an input to the model.
+
+ The text output is produced by a chosen ATS method, which can be extractive (built from the most relevant sentences of the source document)
+ or abstractive (written from scratch rather than copied from the source). The latter is achieved by means of transformers, and the ones present in the
+ model are the existing and widely used BART-Large CNN, Pegasus-XSUM and mT5 Multilingual XLSUM. The extractive methods are taken from
+ the Sumy Python Library and include SumyRandom, SumyLuhn, SumyLsa, SumyLexRank, SumyTextRank, SumySumBasic, SumyKL and SumyReduction. Each of the
+ methods used for text summarization will be described individually in the following sections.
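
The reference-free evaluation mentioned in the description (cosine similarity and Kullback-Leibler divergence between a text and its summary) can be sketched roughly as follows. This is a minimal illustration and not the model's actual evaluator: the function names, the regex tokenizer, the bag-of-words representation, and the additive smoothing are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(source, summary):
    """Cosine between the raw term-frequency vectors of the two texts."""
    a, b = Counter(tokenize(source)), Counter(tokenize(summary))
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def kl_divergence(source, summary, smoothing=1.0):
    """KL(P_source || Q_summary) over the shared vocabulary.

    Additive smoothing is an assumption of this sketch: it keeps words that
    are missing from the summary from causing a division by zero.
    """
    p_counts = Counter(tokenize(source))
    q_counts = Counter(tokenize(summary))
    vocab = p_counts.keys() | q_counts.keys()
    if not vocab:
        return 0.0
    p_total = sum(p_counts.values()) + smoothing * len(vocab)
    q_total = sum(q_counts.values()) + smoothing * len(vocab)
    kl = 0.0
    for word in vocab:
        p = (p_counts[word] + smoothing) / p_total
        q = (q_counts[word] + smoothing) / q_total
        kl += p * math.log(p / q)
    return kl
```

Neither function needs a gold-standard summary: a higher cosine similarity and a lower divergence both suggest that the summary's vocabulary stays close to that of the source text.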
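
To make the extractive idea concrete (a summary built from the most relevant sentences of the source), here is a small word-frequency sentence scorer. It is only loosely in the spirit of frequency-based extractive methods and is not Sumy's implementation; the stop-word list, the naive sentence splitter, and the names `split_sentences`, `content_words`, and `extractive_summary` are hypothetical choices for this sketch.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real systems use larger ones).
STOP_WORDS = {
    "the", "a", "an", "of", "and", "to", "in", "is", "are", "it",
    "that", "this", "for", "on", "with", "as", "by",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(sentence):
    """Lowercased alphabetic tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOP_WORDS]


def extractive_summary(text, n_sentences=2):
    """Return the n highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        # Average word frequency, so long sentences are not favoured by length alone.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:n_sentences])
    return " ".join(sentences[i] for i in chosen)
```

Because the selected sentences are copied verbatim, the output can never contain wording absent from the source, which is the practical difference from the abstractive transformer models.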