- accuracy
- f1-score
---

# xlm-roberta-large-german-cap

## Model description

An `xlm-roberta-large` model fine-tuned on training data containing [major topic codes](https://www.comparativeagendas.net/pages/master-codebook) from the [Comparative Agendas Project](https://www.comparativeagendas.net/).
#### Inference using the Trainer class

```python
model = AutoModelForSequenceClassification.from_pretrained('poltextlab/xlm-roberta-large-german-cap',
                                                           num_labels=num_labels,
                                                           problem_type="multi_label_classification",
                                                           ignore_mismatched_sizes=True
                                                           )

training_args = TrainingArguments(
    output_dir='.',
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8
)

trainer = Trainer(
```
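The Trainer returns raw logits, which are converted to CAP major topic codes by taking the argmax per document and mapping the label index through a label-to-code dictionary. A minimal sketch of that post-processing step; the `CAP_NUM_DICT` below is illustrative only, and the `probs` array is a stand-in for real model output:

```python
import numpy as np
import pandas as pd

# Illustrative mapping from model label indices (0-21) to CAP major topic
# codes; the exact dictionary used for this model is not reproduced here.
CAP_NUM_DICT = {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 7, 7: 8, 8: 9,
                9: 10, 10: 12, 11: 13, 12: 14, 13: 15, 14: 16, 15: 17,
                16: 18, 17: 19, 18: 20, 19: 21, 20: 23, 21: 999}

# Stand-in logits for two documents over the 22 labels
probs = np.zeros((2, 22))
probs[0, 5] = 3.0  # argmax -> label index 5
probs[1, 0] = 2.0  # argmax -> label index 0

# Replace each predicted label index (column 0) with its CAP code
predicted = pd.DataFrame(np.argmax(probs, axis=1)).replace({0: CAP_NUM_DICT})
print(predicted[0].tolist())  # [6, 1]
```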

### Fine-tuning procedure

`xlm-roberta-large-german-cap` was fine-tuned using the Hugging Face Trainer class with the following hyperparameters:

```
training_args = TrainingArguments(
    output_dir=f"../model/{model_dir}/tmp/",
```
We also incorporated an `EarlyStoppingCallback` with a patience of 2 epochs.
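A patience of 2 means training halts once the monitored metric has failed to improve for two consecutive evaluations. The bookkeeping behind that rule can be sketched in plain Python (this mirrors the idea, not the actual `transformers` implementation):

```python
def should_stop(eval_losses, patience=2):
    """Return True once the eval loss fails to improve
    for `patience` consecutive evaluations."""
    best = float("inf")
    bad_rounds = 0
    for loss in eval_losses:
        if loss < best:
            best = loss
            bad_rounds = 0  # improvement resets the counter
        else:
            bad_rounds += 1
            if bad_rounds >= patience:
                return True
    return False

print(should_stop([0.9, 0.8, 0.81, 0.82]))  # True: two evals without improvement
print(should_stop([0.9, 0.8, 0.85, 0.7]))   # False: the streak was broken
```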

## Model performance

The model was evaluated on a test set of 6309 examples (10% of the available data). Model accuracy is **0.83**.
| label        |   precision |   recall |   f1-score |   support |
|:-------------|------------:|---------:|-----------:|----------:|
| 0            |        0.65 |     0.6  |       0.62 |       621 |
| 1            |        0.71 |     0.68 |       0.69 |       473 |
| 2            |        0.79 |     0.73 |       0.76 |       247 |
| 3            |        0.77 |     0.71 |       0.74 |       156 |
| 4            |        0.68 |     0.58 |       0.63 |       383 |
| 5            |        0.79 |     0.82 |       0.8  |       351 |
| 6            |        0.71 |     0.78 |       0.74 |       329 |
| 7            |        0.81 |     0.79 |       0.8  |       216 |
| 8            |        0.78 |     0.75 |       0.76 |       157 |
| 9            |        0.87 |     0.78 |       0.83 |       272 |
| 10           |        0.61 |     0.68 |       0.64 |       315 |
| 11           |        0.61 |     0.74 |       0.67 |       487 |
| 12           |        0.72 |     0.7  |       0.71 |       145 |
| 13           |        0.69 |     0.6  |       0.64 |       346 |
| 14           |        0.75 |     0.69 |       0.72 |       359 |
| 15           |        0.69 |     0.65 |       0.67 |       189 |
| 16           |        0.36 |     0.47 |       0.41 |        55 |
| 17           |        0.68 |     0.73 |       0.71 |       618 |
| 18           |        0.61 |     0.68 |       0.64 |       469 |
| 19           |        0    |     0    |       0    |        18 |
| 20           |        0.73 |     0.75 |       0.74 |       102 |
| 21           |        0    |     0    |       0    |         1 |
| macro avg    |        0.64 |     0.63 |       0.63 |      6309 |
| weighted avg |        0.7  |     0.69 |       0.69 |      6309 |
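The macro and weighted averages in the table follow the standard definitions: an unweighted mean over classes versus a support-weighted mean. A small numpy sketch using the f1-scores and supports of the first three rows above:

```python
import numpy as np

# f1-score and support for labels 0, 1, 2 from the table
f1 = np.array([0.62, 0.69, 0.76])
support = np.array([621, 473, 247])

macro_f1 = f1.mean()                           # unweighted mean over classes
weighted_f1 = np.average(f1, weights=support)  # support-weighted mean

print(round(macro_f1, 2))     # 0.69
print(round(weighted_f1, 2))  # 0.67
```

Weighting by support pulls the average toward the large classes, which is why the weighted averages sit above the macro averages here: the rare labels 19 and 21 score 0 but carry little weight.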

## Inference platform

This model is used by the [CAP Babel Machine](https://babel.poltextlab.com), an open-source and free natural language processing tool designed to simplify and speed up projects for comparative research.