tiedaar committed on
Commit
bd65821
1 Parent(s): 8c54546

Create README.md

  1. README.md +30 -0
README.md ADDED
@@ -0,0 +1,30 @@
+ ---
+ language:
+ - "en"
+ thumbnail: "url to a thumbnail used in social sharing"
+ tags:
+ - 'macroeconomics'
+ - 'automated summary evaluation'
+ - 'wording'
+ license: "apache-2.0"
+ metrics:
+ - 'mse'
+ ---
+
+ # Wording Model
+ This is a Longformer model with a regression head, designed to predict the wording score of a summary.
+ ## Corpus
+ It was trained on a corpus of 4,233 summaries of 101 sources compiled by Botarleanu et al. (2022).
+ The summaries were graded by expert raters on six criteria: Details, Main Point, Cohesion, Paraphrasing, Objective Language, and Language Beyond the Text.
+ A principal component analysis was used to reduce the dimensionality of the outcome variables to two:
+ * **Content** includes Details, Main Point, and Cohesion
+ * **Wording** includes Paraphrasing, Objective Language, and Language Beyond the Text
+
+ ## Score
+ This model predicts the Wording score. The model that predicts the Content score can be found [here](https://huggingface.co/tiedaar/summary-longformer-content).
+ The following diagram illustrates the model architecture:
+
+ ![model diagram](model_diagram.png)
+
+ When providing input to the model, the summary and the source should be concatenated using the separator token `</s>`.
+ This gives the model access to both the summary and the source, which allows it to produce more accurate scores. The model achieved an R² of 0.66 on the test set of summaries.
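
The input format described in the README can be sketched in Python as follows. This is a minimal illustration, not the author's code. It assumes that the summary comes first and the source second, that the checkpoint loads through `AutoModelForSequenceClassification` with a single regression output, and that a 4096-token limit applies. The helper names `build_model_input` and `predict_wording_score` are hypothetical, and `model_id` must be supplied by the caller because the README does not state this model's repository ID.

```python
SEP_TOKEN = "</s>"  # separator token named in the README


def build_model_input(summary: str, source: str, sep: str = SEP_TOKEN) -> str:
    """Join summary and source around the separator (summary-first order is an assumption)."""
    return f"{summary.strip()}{sep}{source.strip()}"


def predict_wording_score(summary: str, source: str, model_id: str) -> float:
    """Score one summary with a checkpoint chosen by the caller (hypothetical helper)."""
    # Imported lazily so build_model_input works without torch/transformers installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(
        build_model_input(summary, source),
        truncation=True,
        max_length=4096,  # assumed limit, typical for Longformer base checkpoints
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**inputs).logits
    # Assumes the regression head emits exactly one value per input.
    return logits.squeeze().item()
```

Calling `build_model_input("A summary.", "The source text.")` returns the single string passed to the tokenizer. Keeping the concatenation in its own function makes the input convention easy to check without downloading any weights.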