---
license: mit
language:
- 'no'
- nn
- da
- sv
- en
---

# Scandinavian Education Classifier Snowflake

## !!! We recommend using [our BERT-based model](https://huggingface.co/north/scandinavian_education_classifier_bert) instead for production

Trained using code from [CosmoPedia](https://github.com/huggingface/cosmopedia/tree/main/classification), with [nb-bert-base](https://huggingface.co/Snowflake/snowflake-arctic-embed-m) as the starting point. The [data](https://huggingface.co/datasets/north/scandinavian-llama3-annotations) used for the classifier comes from [GlotCC](https://huggingface.co/datasets/cis-lmu/GlotCC-V1) and was annotated using Gemini 1.5 Flash.

The following command was used for training:

```
python train_edu_bert.py \
    --base_model_name="NbAiLab/nb-bert-base" \
    --dataset_name="north/scandinavian-educational-annotations" \
    --target_column="score" \
    --checkpoint_dir="/home/pere/checkpoints/scandinavian_bert/"
```

## Classification Report

| Class | Precision | Recall | F1-Score | Support |
|-------|-----------|--------|----------|---------|
| 0     | 0.76      | 0.64   | 0.70     | 18274   |
| 1     | 0.63      | 0.76   | 0.69     | 23348   |
| 2     | 0.48      | 0.40   | 0.43     | 6621    |
| 3     | 0.57      | 0.28   | 0.38     | 1314    |
| 4     | 0.56      | 0.06   | 0.12     | 433     |
| 5     | 0.00      | 0.00   | 0.00     | 10      |

| Metric                 | Value |
|------------------------|-------|
| Accuracy               | 0.65  |
| Macro Avg Precision    | 0.50  |
| Macro Avg Recall       | 0.36  |
| Macro Avg F1-Score     | 0.38  |
| Weighted Avg Precision | 0.65  |
| Weighted Avg Recall    | 0.65  |
| Weighted Avg F1-Score  | 0.64  |
| Total Support          | 50000 |

## Confusion Matrix

Rows are true classes; columns are predicted classes.

|         | Class 0 | Class 1 | Class 2 | Class 3 | Class 4 | Class 5 |
|---------|---------|---------|---------|---------|---------|---------|
| Class 0 | 11725   | 6460    | 88      | 1       | 0       | 0       |
| Class 1 | 3598    | 17758   | 1978    | 14      | 0       | 0       |
| Class 2 | 128     | 3733    | 2618    | 142     | 0       | 0       |
| Class 3 | 6       | 272     | 645     | 369     | 22      | 0       |
| Class 4 | 2       | 121     | 161     | 121     | 28      | 0       |
| Class 5 | 0       | 2       | 8       | 0       | 0       | 0       |

## Evaluation Metrics

| Metric                  | Value               |
|-------------------------|---------------------|
| Eval Loss               | 0.3311704695224762  |
| Eval Precision          | 0.49857140934204414 |
| Eval Recall             | 0.35718277242555724 |
| Eval F1 Macro           | 0.38442290605864393 |
| Eval Accuracy           | 0.64996             |
| Eval Runtime (s)        | 86.1773             |
| Eval Samples Per Second | 580.199             |
| Eval Steps Per Second   | 4.537               |
| Epoch                   | 19.91               |

## Training Metrics

| Metric        | Value                 |
|---------------|-----------------------|
| Loss          | 0.318                 |
| Grad Norm     | 0.6617229580879211    |
| Learning Rate | 5.119453924914675e-07 |
| Epoch         | 19.97                 |

## Training Runtime

| Metric                   | Value             |
|--------------------------|-------------------|
| Train Runtime (s)        | 19583.1034        |
| Train Samples Per Second | 459.58            |
| Train Steps Per Second   | 1.795             |
| Train Loss               | 0.341879387194793 |
| Epoch                    | 20.0              |
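
## Reproducing the Report (sketch)

The classification report and confusion matrix above follow the layout produced by scikit-learn. The snippet below is a minimal sketch of how such a report can be generated from held-out labels and rounded model predictions; the `y_true` and `y_pred` arrays are illustrative placeholders, not data from this card.

```python
# Sketch: build a per-class report and confusion matrix with scikit-learn.
# y_true / y_pred are toy placeholders; replace them with your own
# gold scores and rounded model predictions (integers 0-5).
from sklearn.metrics import classification_report, confusion_matrix

y_true = [0, 1, 1, 2, 3, 4, 5]  # gold integer scores (0-5)
y_pred = [0, 1, 2, 2, 3, 3, 2]  # rounded model predictions

print(classification_report(y_true, y_pred, labels=list(range(6)), digits=2))

# Rows are true classes, columns are predicted classes, matching the
# confusion-matrix table above.
print(confusion_matrix(y_true, y_pred, labels=list(range(6))))
```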
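
## Inference Example (sketch)

The snippet below is a minimal inference sketch, not an official usage example. It assumes the standard CosmoPedia `train_edu_bert.py` setup (a sequence-classification head with a single regression output that is rounded and clipped to an integer score from 0 to 5), and the repository id is a placeholder that may not match the published model.

```python
# Minimal inference sketch under the assumptions stated above.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "north/scandinavian_education_classifier_snowflake"  # placeholder id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

# Norwegian example: "Photosynthesis is the process by which plants convert
# light into chemical energy."
text = "Fotosyntesen er prosessen der planter omdanner lys til kjemisk energi."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits.squeeze(-1)

raw_score = logits.item()
int_score = int(round(max(0.0, min(raw_score, 5.0))))  # clip to the 0-5 scale
print({"text": text, "score": raw_score, "int_score": int_score})
```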