Model Details
Model Description
The XLM-RoBERTa model was proposed in Unsupervised Cross-lingual Representation Learning at Scale by Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer and Veselin Stoyanov. It is based on Facebook's RoBERTa model released in 2019 and is a large multilingual language model trained on 2.5TB of filtered CommonCrawl data. This model is XLM-RoBERTa-large fine-tuned on the ner-wikipedia-dataset for Japanese named entity recognition; a usage sketch follows the details below.
- Developed by: See associated paper
- Model type: Multilingual NER (token-classification) model
- Language(s) (NLP): Japanese
- License: [More Information Needed]
- Finetuned from model [optional]: XLM-RoBERTa-large
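As a usage sketch (the model id below is a placeholder, since the repository id is not stated here; substitute the actual Hub id), the fine-tuned model can be loaded with the transformers token-classification pipeline:

```python
from transformers import pipeline

# Placeholder model id -- replace with the actual repository id on the Hub.
model_id = "your-username/xlm-roberta-large-ner-japanese"

# "simple" aggregation merges subword pieces back into whole entity spans.
ner = pipeline("token-classification", model=model_id, aggregation_strategy="simple")

# Illustrative input: 鈴木 should be tagged PER and 鎌倉 LOC.
print(ner("鈴木は4月の陽気の良い日に、鎌倉を訪れた。"))
```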
Each token is labeled using the IO scheme with the following tags:
| Label id | Tag | Tag in Widget | Description |
|---|---|---|---|
| 0 | O | (None) | others or nothing |
| 1 | PER | PER | person |
| 2 | ORG | ORG | general corporation or organization |
| 3 | ORG-P | P | political organization |
| 4 | ORG-O | O | other organization |
| 5 | LOC | LOC | location |
| 6 | INS | INS | institution, facility |
| 7 | PRD | PRD | product |
| 8 | EVT | EVT | event |
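For reference, the table above corresponds to an id2label mapping along these lines (a sketch assuming the tags are stored exactly as listed):

```python
# Sketch of the id-to-tag mapping from the table above.
id2label = {
    0: "O",
    1: "PER",
    2: "ORG",
    3: "ORG-P",
    4: "ORG-O",
    5: "LOC",
    6: "INS",
    7: "PRD",
    8: "EVT",
}
label2id = {tag: i for i, tag in id2label.items()}
```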
Training Details
See the following sections for training data and training procedure details:
Training procedure
The source code for fine-tuning is heavily inspired by this repository, with slight modifications.
Training Hyperparameters
The following hyperparameters were used during training (a code sketch of these settings follows the list):
- learning_rate: 5e-05 (default)
- train_batch_size: 12
- eval_batch_size: 12
- seed: 42 (default)
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 (default)
- lr_scheduler_type: linear (default)
- weight_decay: 0.01
- num_epochs: 5
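A minimal sketch reconstructing these settings with the transformers TrainingArguments API (output_dir is hypothetical; the values marked "default" above are spelled out explicitly here):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="xlm-roberta-large-ner-japanese",  # hypothetical path
    learning_rate=5e-5,           # default
    per_device_train_batch_size=12,
    per_device_eval_batch_size=12,
    seed=42,                      # default
    weight_decay=0.01,
    num_train_epochs=5,
    lr_scheduler_type="linear",   # default
    adam_beta1=0.9,               # Adam defaults noted above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```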
Evaluation
| Training Loss | Epoch | Validation Loss | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|---|---|
| No log | 1.0 | 0.1645 | 0.7581 | 0.8235 | 0.7894 | 0.9540 |
| No log | 2.0 | 0.1523 | 0.8153 | 0.8414 | 0.8281 | 0.9611 |
| No log | 3.0 | 0.1188 | 0.8416 | 0.8741 | 0.8575 | 0.9683 |
| No log | 4.0 | 0.1320 | 0.8621 | 0.8935 | 0.8775 | 0.9725 |
| No log | 5.0 | 0.1422 | 0.8796 | 0.9032 | 0.8913 | 0.9728 |
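The card does not state how these metrics were computed; as an illustrative sketch under that caveat, token-level precision, recall, F1, and accuracy can be derived from Trainer predictions like this (the actual procedure may differ, e.g. entity-level scoring):

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Label ids 1-8 are entity tags; 0 is "O" (see the label table above).
ENTITY_LABELS = list(range(1, 9))

def compute_metrics(eval_pred):
    """Token-level metrics: flatten predictions and skip padding (-100)."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    true_ids = [l for row in labels for l in row if l != -100]
    pred_ids = [p for prow, lrow in zip(predictions, labels)
                for p, l in zip(prow, lrow) if l != -100]
    # Micro-averaged P/R/F over entity labels only, so "O" does not dominate.
    precision, recall, f1, _ = precision_recall_fscore_support(
        true_ids, pred_ids, average="micro", labels=ENTITY_LABELS
    )
    return {"precision": precision, "recall": recall, "f1": f1,
            "accuracy": accuracy_score(true_ids, pred_ids)}
```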
Testing Data, Factors & Metrics
Testing Data
The ner-wikipedia-dataset was split into training and evaluation sets with a 9:1 ratio.
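A sketch of that split with the datasets library (the Hub id below is an assumption about where ner-wikipedia-dataset is hosted; verify before use):

```python
from datasets import load_dataset

# Assumed Hub id for the ner-wikipedia-dataset -- verify before use.
dataset = load_dataset("stockmark/ner-wikipedia-dataset", split="train")

# 9:1 train/eval split, seeded for reproducibility.
splits = dataset.train_test_split(test_size=0.1, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]
```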