
Model Details

Model Description

The XLM-RoBERTa model was proposed in Unsupervised Cross-lingual Representation Learning at Scale by Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer and Veselin Stoyanov. It is based on Facebook's RoBERTa model released in 2019 and is a large multilingual language model trained on 2.5 TB of filtered CommonCrawl data. This model is XLM-RoBERTa-large fine-tuned for named entity recognition on the Japanese ner-wikipedia-dataset.

  • Developed by: See associated paper
  • Model type: Multilingual language model for named entity recognition (NER)
  • Language(s) (NLP): Japanese
  • License: [More Information Needed]
  • Finetuned from model: XLM-RoBERTa-large

Each token is labeled using the IO scheme with the following tags:

Label id  Tag    Tag in Widget  Description
0         O      (None)         others or nothing
1         PER    PER            person
2         ORG    ORG            general corporation organization
3         ORG-P  P              political organization
4         ORG-O  O              other organization
5         LOC    LOC            location
6         INS    INS            institution, facility
7         PRD    PRD            product
8         EVT    EVT            event
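
For reference, a minimal inference sketch using the transformers pipeline API (the model id is the one named at the end of this card; the example sentence is illustrative only):

```python
from transformers import pipeline

# Token-classification pipeline for the fine-tuned checkpoint.
ner = pipeline(
    "token-classification",
    model="rizkyfoxcale/xlm-roberta-large-ner-ja",
    # The label scheme above has no B-/I- prefixes, so "simple" aggregation
    # merges consecutive sub-word tokens that share the same tag.
    aggregation_strategy="simple",
)

text = "山田太郎は東京大学で働いています。"  # illustrative sentence
for entity in ner(text):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```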

Training Details

See the following resources for training data and training procedure details:

Training procedure

The fine-tuning source code is heavily inspired by this repository, with minor modifications.
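
The repository itself is not linked in this card, so the snippet below is only a generic sketch of the usual preprocessing step for token classification with XLM-RoBERTa: sub-word tokenization and alignment of word-level tags. The tokens and ner_tags column names are assumptions, not the exact published code:

```python
from transformers import AutoTokenizer

# Base tokenizer of the model being fine-tuned.
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-large")

def tokenize_and_align_labels(examples):
    """Tokenize pre-split words and align word-level tags with sub-word tokens."""
    tokenized = tokenizer(examples["tokens"], truncation=True, is_split_into_words=True)
    all_labels = []
    for i, labels in enumerate(examples["ner_tags"]):
        word_ids = tokenized.word_ids(batch_index=i)
        # Special tokens get -100 so the loss ignores them; with the IO scheme,
        # every sub-word token simply inherits its word's label.
        all_labels.append([-100 if w is None else labels[w] for w in word_ids])
    tokenized["labels"] = all_labels
    return tokenized
```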

Training Hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05 (default)
  • train_batch_size: 12
  • eval_batch_size: 12
  • seed: 42 (default)
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 (default)
  • lr_scheduler_type: linear (default)
  • weight_decay: 0.01
  • num_epochs: 5
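
A minimal sketch of how these settings map onto Hugging Face TrainingArguments (the output directory is arbitrary; the actual training script is not published here):

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the listed hyperparameters with the standard
# transformers Trainer API.
training_args = TrainingArguments(
    output_dir="xlm-roberta-large-ner-ja",  # arbitrary choice
    learning_rate=5e-5,
    per_device_train_batch_size=12,
    per_device_eval_batch_size=12,
    seed=42,
    weight_decay=0.01,
    num_train_epochs=5,
    lr_scheduler_type="linear",
)
```

The Adam betas and epsilon listed above are the optimizer defaults, so they need no explicit arguments.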

Evaluation

Training Loss  Epoch  Validation Loss  Precision  Recall  F1      Accuracy
No log         1.0    0.1645           0.7581     0.8235  0.7894  0.9540
No log         2.0    0.1523           0.8153     0.8414  0.8281  0.9611
No log         3.0    0.1188           0.8416     0.8741  0.8575  0.9683
No log         4.0    0.1320           0.8621     0.8935  0.8775  0.9725
No log         5.0    0.1422           0.8796     0.9032  0.8913  0.9728

Testing Data, Factors & Metrics

Testing Data

The ner-wikipedia-dataset was split into training and evaluation sets with a 9:1 ratio.
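
A minimal sketch of such a split with the datasets library; the Hub identifier stockmark/ner-wikipedia-dataset and the seed are assumptions, since the card only names ner-wikipedia-dataset and does not publish the split procedure:

```python
from datasets import load_dataset

# Hub identifier assumed; the card only names "ner-wikipedia-dataset".
dataset = load_dataset("stockmark/ner-wikipedia-dataset", split="train")

# 9:1 train-eval split matching the ratio stated above (seed chosen arbitrarily).
split = dataset.train_test_split(test_size=0.1, seed=42)
train_ds, eval_ds = split["train"], split["test"]
print(len(train_ds), len(eval_ds))
```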

Model size: 559M parameters (F32, safetensors)

Dataset used to train rizkyfoxcale/xlm-roberta-large-ner-ja: ner-wikipedia-dataset