---
extra_gated_prompt: "You agree to adhere to all terms and conditions for using the model as specified by the IEA License Agreement."
extra_gated_fields:
  Company: text
  Country: country
  Specific date: date_picker
  I want to use this model for:
    type: select
    options:
      - Research
      - Education
      - label: Other
        value: other
  I agree to use this model for non-commercial use ONLY: checkbox
  I agree to not redistribute the data or share access credentials: checkbox
  I agree to cite the IEA model source in any publications or presentations: checkbox
  I understand that ICILS is a registered trademark of IEA and is protected by trademark law: checkbox
  I agree that the use of the model for assessments or learning materials requires prior notice to IEA: checkbox
license: mit
base_model: jjzha/esco-xlm-roberta-large
datasets:
  - ICILS/multilingual_parental_occupations
pipeline_tag: text-classification
metrics:
  - accuracy
  - danieldux/isco_hierarchical_accuracy
widget:
  - text: Beauticians and Related Workers
    example_title: Example 1
  - text: She is a beautition at hair and beauty. She owns a hair and beauty salon
    example_title: Example 2
  - text: "Retired. Doesn't work anymore."
    example_title: Example 3
  - text: Ingeniero civil. ayuda en construcciones
    example_title: Example 4
tags:
  - ISCO
  - ESCO
  - occupation coding
  - ICILS
language:
  - da
  - de
  - en
  - es
  - fi
  - fr
  - it
  - kk
  - ko
  - kz
  - pt
  - ro
  - ru
  - sv
model-index:
  - name: xlm-r-icils-ilo
    results:
      - task:
          name: Text Classification
          type: text-classification
        dataset:
          name: ICILS/multilingual_parental_occupations
          type: ICILS/multilingual_parental_occupations
          config: icils
          split: test
          args: icils
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.6285
          - name: ISCO Hierarchical Accuracy
            type: danieldux/isco_hierarchical_accuracy
            value: 0.95
library_name: transformers
---

# Model Card for ICILS XLM-R ISCO

This model is a fine-tuned version of [ESCOXLM-R](https://huggingface.co/jjzha/esco-xlm-roberta-large) trained on [The ICILS Multilingual ISCO-08 Parental Occupation Corpus](https://huggingface.co/datasets/ICILS/multilingual_parental_occupations). An R&D report explaining the research is available at [https://www.iea.nl/publications/rd-outcomes/improving-parental-occupation-coding-procedures-ai](https://www.iea.nl/publications/rd-outcomes/improving-parental-occupation-coding-procedures-ai).

It achieves the following results on the test split:

- Loss: 1.7849
- Accuracy: 0.6285
- Hierarchical Accuracy: 0.95

The research paper, [ESCOXLM-R: Multilingual Taxonomy-driven Pre-training for the Job Market Domain](https://aclanthology.org/2023.acl-long.662/), states: "ESCOXLM-R, based on XLM-R-large, uses domain-adaptive pre-training on the [European Skills, Competences, Qualifications and Occupations](https://esco.ec.europa.eu/en/classification/occupation-main) (ESCO) taxonomy, covering 27 languages. The pre-training objectives for ESCOXLM-R include dynamic masked language modeling and a novel additional objective for inducing multilingual taxonomical ESCO relations" (Zhang et al., ACL 2023).

## Model Details

### Model Description

IEA is an international cooperative of national research institutions, governmental research agencies, scholars, and analysts working to research, understand, and improve education worldwide.

- **Developed by:** [The International Computer and Information Literacy Study](https://www.iea.nl/studies/iea/icils)
- **Funded by:** [IEA International Association for the Evaluation of Educational Achievement](https://www.iea.nl/)
- **Shared by [optional]:** [More Information Needed]
- **Model type:** Multilingual text classification model (XLM-R-large architecture) for ISCO-08 occupation coding
- **Language(s) (NLP):** Danish, German, English, Spanish, Finnish, French, Italian, Kazakh, Korean, Portuguese, Romanian, Russian, and Swedish (see the `language` tags in the metadata)
- **License:** MIT, subject to the gated-access conditions stated above (non-commercial use only)
- **Finetuned from model:** [ESCOXLM-R](https://huggingface.co/jjzha/esco-xlm-roberta-large)

### Model Sources

- **Repository:** [More Information Needed]
- **Paper:** [Improving parental occupation coding procedures with AI](https://www.iea.nl/publications/rd-outcomes/improving-parental-occupation-coding-procedures-ai)
- **Demo:** [https://huggingface.co/spaces/ICILS/ICILS-XLM-R-ISCO](https://huggingface.co/spaces/ICILS/ICILS-XLM-R-ISCO)

## Uses

### Direct Use

[More Information Needed]

### Downstream Use [optional]

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.
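Pending an official snippet, here is a minimal inference sketch using the `transformers` pipeline. The Hub id `ICILS/XLM-R-ISCO` is an assumption inferred from the card title and the demo Space; substitute this repository's actual id if it differs. The example inputs are taken from the widget examples above.

```python
from transformers import pipeline

# NOTE: "ICILS/XLM-R-ISCO" is an assumed Hub id inferred from the card title
# and the demo Space (ICILS/ICILS-XLM-R-ISCO); replace it with this
# repository's actual id if it differs.
classifier = pipeline("text-classification", model="ICILS/XLM-R-ISCO")

# Free-text occupation descriptions in any supported language
# (taken verbatim from the widget examples above).
examples = [
    "She is a beautition at hair and beauty. She owns a hair and beauty salon",
    "Ingeniero civil. ayuda en construcciones",
]

for prediction in classifier(examples):
    # Each result carries the predicted ISCO-08 label and a confidence score.
    print(prediction["label"], prediction["score"])
```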
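The ISCO hierarchical accuracy reported in this card is the community metric named in the metadata (`danieldux/isco_hierarchical_accuracy`), which presumably grants partial credit when a prediction falls in the correct ISCO-08 major, sub-major, or minor group, which would help explain why it is much higher (0.95) than exact accuracy (0.6285). A sketch of computing it with the `evaluate` library follows; the input format (ISCO-08 code strings) is an assumption, so check the metric page for details.

```python
import evaluate

# The metric id comes from this card's metadata; it is resolved as a
# community metric hosted on the Hugging Face Hub.
isco_accuracy = evaluate.load("danieldux/isco_hierarchical_accuracy")

# ASSUMPTION: inputs are ISCO-08 code strings. Here the prediction differs
# from the reference at the 4-digit unit group but shares the 3-digit
# minor group 514, so a hierarchical metric should award partial credit.
results = isco_accuracy.compute(
    references=["5142"],   # Beauticians and Related Workers
    predictions=["5141"],  # Hairdressers
)
print(results)
```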
## Training Details

### Training Data

The model was trained on the train and validation splits of the `icils` configuration of [The ICILS Multilingual ISCO-08 Parental Occupation Corpus](https://huggingface.co/datasets/ICILS/multilingual_parental_occupations).

### Training Procedure

#### Preprocessing [optional]

[More Information Needed]

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 12.0

### Training results

| Training Loss | Epoch | Step  | Accuracy | Validation Loss |
|:-------------:|:-----:|:-----:|:--------:|:---------------:|
| 3.2269        | 1.0   | 3518  | 0.4176   | 2.9434          |
| 2.2851        | 2.0   | 7036  | 0.5250   | 2.2479          |
| 1.937         | 3.0   | 10554 | 0.5691   | 1.9822          |
| 1.4695        | 4.0   | 14072 | 0.6018   | 1.8560          |
| 1.2157        | 5.0   | 17590 | 0.6114   | 1.8160          |
| 0.9819        | 6.0   | 21108 | 0.6214   | 1.7946          |
| 0.8608        | 7.0   | 24626 | 0.6285   | 1.7849          |
| 0.8374        | 8.0   | 28144 | 0.6353   | 1.7893          |
| 0.7908        | 9.0   | 31662 | 0.6239   | 1.8279          |
| 0.6962        | 10.0  | 35180 | 0.6347   | 1.8472          |
| 0.6371        | 11.0  | 38698 | 0.6339   | 1.8669          |
| 0.5226        | 12.0  | 42216 | 0.6336   | 1.8695          |

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The model was trained on the `icils` configuration of the ISCO-08 dataset using the train and validation splits and evaluated on the test split.

#### Factors

[More Information Needed]

#### Metrics

Accuracy and ISCO hierarchical accuracy (`danieldux/isco_hierarchical_accuracy`); see the evaluation sketch under "How to Get Started with the Model".

### Results

[More Information Needed]

#### Summary

## Model Examination [optional]

[More Information Needed]

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

### Framework versions

- Transformers 4.40.0.dev0
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2

## Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]