Update README.md
README.md
CHANGED
@@ -6,8 +6,8 @@ license: mit
 An xlm-roberta-large model fine-tuned on all ~1,8 million annotated statements contained in the [manifesto corpus](https://manifesto-project.wzb.eu/information/documents/corpus) (version 2023a).
 The model can be used to categorize any type of text into [56 different political categories](https://manifesto-project.wzb.eu/coding_schemes/mp_v4) according to the Manifesto Project's coding scheme (Handbook 4).
 
-The context model variant additionally
-During fine-tuning we collected the surrounding sentences of a statement and
+The context model variant additionally incorporates the surrounding sentences of a statement to improve the classification results for ambiguous sentences.
+During fine-tuning we collected the surrounding sentences of a statement and merged them with the statement itself to provide the larger context of a sentence as the second part of a sentence pair input.
 We limited the statement itself to 100 tokens and the context of the statement to 200 tokens.
 
 **Important**
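The sentence-pair input described in the changed lines could be assembled roughly as follows. This is a minimal sketch, not the authors' preprocessing code: `truncate`, `build_pair`, and the `window` parameter are hypothetical names, the number of surrounding sentences is an assumption (the README does not state it), and whitespace-separated words stand in for the model's subword tokens.

```python
from typing import List, Tuple

MAX_STATEMENT_TOKENS = 100  # statement limit stated in the README
MAX_CONTEXT_TOKENS = 200    # context limit stated in the README


def truncate(text: str, max_tokens: int) -> str:
    """Keep at most max_tokens whitespace-separated words.

    Assumption: words stand in for the subword tokens the real tokenizer would count.
    """
    return " ".join(text.split()[:max_tokens])


def build_pair(sentences: List[str], index: int, window: int = 2) -> Tuple[str, str]:
    """Return (statement, context) for sentences[index].

    The context merges up to `window` neighbours on each side with the
    statement itself. `window` is an assumed value, not taken from the README.
    """
    start = max(0, index - window)
    end = min(len(sentences), index + window + 1)
    statement = truncate(sentences[index], MAX_STATEMENT_TOKENS)
    context = truncate(" ".join(sentences[start:end]), MAX_CONTEXT_TOKENS)
    return statement, context


doc = [
    "We will lower taxes.",
    "This helps families.",
    "We will also invest in schools.",
    "Teachers deserve better pay.",
]
statement, context = build_pair(doc, index=1, window=1)
print(statement)  # first segment of the pair
print(context)    # second segment of the pair
```

With a Hugging Face tokenizer, the two strings would typically be passed as a text pair, for example `tokenizer(statement, context)`. Note that this sketch truncates the context from the end, so very long preceding sentences could push the statement out of the 200-token window; the original preprocessing may handle that case differently.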