---
license: mit
language:
- 'no'
- nn
- da
- sv
- en
---

# Scandinavian Education Classifier Snowflake

## !!! We recommend using [our BERT-based model](https://huggingface.co/north/scandinavian_education_classifier_bert) instead for production

Trained using code from [CosmoPedia](https://github.com/huggingface/cosmopedia/tree/main/classification), with [nb-bert-base](https://huggingface.co/Snowflake/snowflake-arctic-embed-m) as the starting point. The [data](https://huggingface.co/datasets/north/scandinavian-llama3-annotations) used for the classifier comes from [GlotCC](https://huggingface.co/datasets/cis-lmu/GlotCC-V1) and was annotated using Gemini 1.5 Flash.

The following command was used for training:

```
python train_edu_bert.py \
    --base_model_name="NbAiLab/nb-bert-base" \
    --dataset_name="north/scandinavian-educational-annotations" \
    --target_column="score" \
    --checkpoint_dir="/home/pere/checkpoints/scandinavian_bert/"
```

## Classification Report

| Class | Precision | Recall | F1-Score | Support |
|-------|-----------|--------|----------|---------|
| 0     | 0.76      | 0.64   | 0.70     | 18274   |
| 1     | 0.63      | 0.76   | 0.69     | 23348   |
| 2     | 0.48      | 0.40   | 0.43     | 6621    |
| 3     | 0.57      | 0.28   | 0.38     | 1314    |
| 4     | 0.56      | 0.06   | 0.12     | 433     |
| 5     | 0.00      | 0.00   | 0.00     | 10      |

| Metric                 | Value |
|------------------------|-------|
| Accuracy               | 0.65  |
| Macro Avg Precision    | 0.50  |
| Macro Avg Recall       | 0.36  |
| Macro Avg F1-Score     | 0.38  |
| Weighted Avg Precision | 0.65  |
| Weighted Avg Recall    | 0.65  |
| Weighted Avg F1-Score  | 0.64  |
| Total Support          | 50000 |

## Confusion Matrix

Rows are true classes; columns are predicted classes.

|         | Class 0 | Class 1 | Class 2 | Class 3 | Class 4 | Class 5 |
|---------|---------|---------|---------|---------|---------|---------|
| Class 0 | 11725   | 6460    | 88      | 1       | 0       | 0       |
| Class 1 | 3598    | 17758   | 1978    | 14      | 0       | 0       |
| Class 2 | 128     | 3733    | 2618    | 142     | 0       | 0       |
| Class 3 | 6       | 272     | 645     | 369     | 22      | 0       |
| Class 4 | 2       | 121     | 161     | 121     | 28      | 0       |
| Class 5 | 0       | 2       | 8       | 0       | 0       | 0       |

## Evaluation Metrics

| Metric                  | Value               |
|-------------------------|---------------------|
| Eval Loss               | 0.3311704695224762  |
| Eval Precision          | 0.49857140934204414 |
| Eval Recall             | 0.35718277242555724 |
| Eval F1 Macro           | 0.38442290605864393 |
| Eval Accuracy           | 0.64996             |
| Eval Runtime (s)        | 86.1773             |
| Eval Samples Per Second | 580.199             |
| Eval Steps Per Second   | 4.537               |
| Epoch                   | 19.91               |

## Training Metrics

| Metric        | Value                 |
|---------------|-----------------------|
| Loss          | 0.318                 |
| Grad Norm     | 0.6617229580879211    |
| Learning Rate | 5.119453924914675e-07 |
| Epoch         | 19.97                 |

## Training Runtime

| Metric                   | Value             |
|--------------------------|-------------------|
| Train Runtime (s)        | 19583.1034        |
| Train Samples Per Second | 459.58            |
| Train Steps Per Second   | 1.795             |
| Train Loss               | 0.341879387194793 |
| Epoch                    | 20.0              |
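
## Reproducing the Report (sketch)

The classification report and confusion matrix above follow the layout produced by scikit-learn. The snippet below is a minimal sketch of how such a report can be generated from held-out labels and rounded model predictions; the `y_true` and `y_pred` arrays are illustrative placeholders, not data from this card.

```python
# Sketch: build a per-class report and confusion matrix with scikit-learn.
# y_true / y_pred are toy placeholders; replace them with your own
# gold scores and rounded model predictions (integers 0-5).
from sklearn.metrics import classification_report, confusion_matrix

y_true = [0, 1, 1, 2, 3, 4, 5]  # gold integer scores (0-5)
y_pred = [0, 1, 2, 2, 3, 3, 2]  # rounded model predictions

print(classification_report(y_true, y_pred, labels=list(range(6)), digits=2))

# Rows are true classes, columns are predicted classes, matching the
# confusion-matrix table above.
print(confusion_matrix(y_true, y_pred, labels=list(range(6))))
```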
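
## Inference Example (sketch)

The snippet below is a minimal inference sketch, not an official usage example. It assumes the standard CosmoPedia `train_edu_bert.py` setup (a sequence-classification head with a single regression output that is rounded and clipped to an integer score from 0 to 5), and the repository id is a placeholder that may not match the published model.

```python
# Minimal inference sketch under the assumptions stated above.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "north/scandinavian_education_classifier_snowflake"  # placeholder id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

# Norwegian example: "Photosynthesis is the process by which plants convert
# light into chemical energy."
text = "Fotosyntesen er prosessen der planter omdanner lys til kjemisk energi."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits.squeeze(-1)

raw_score = logits.item()
int_score = int(round(max(0.0, min(raw_score, 5.0))))  # clip to the 0-5 scale
print({"text": text, "score": raw_score, "int_score": int_score})
```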