DeBERTa V2 base Japanese

This is a DeBERTaV2 model pretrained on Japanese texts. The code for pretraining is available at retarfi/language-pretraining.

How to use

You can use this model for masked language modeling as follows:

from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("izumi-lab/deberta-v2-base-japanese", use_fast=False)
model = AutoModelForMaskedLM.from_pretrained("izumi-lab/deberta-v2-base-japanese")
...
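Once the model returns logits of shape (batch, sequence_length, vocab_size), the prediction at a masked position is the argmax over the vocabulary axis. A minimal sketch of that readout step using dummy logits and a toy vocabulary in place of a real forward pass (`toy_vocab`, `mask_index`, and the logits values are illustrative, not the model's actual vocabulary):

```python
import numpy as np

def predict_masked(logits, mask_index, vocab):
    """Return the vocabulary entry with the highest score at mask_index."""
    return vocab[int(np.argmax(logits[0, mask_index]))]

# Dummy logits standing in for model(**inputs).logits:
# batch of 1, sequence of 5 tokens, toy vocabulary of 4 entries.
toy_vocab = ["[PAD]", "[MASK]", "tokyo", "osaka"]
logits = np.zeros((1, 5, 4))
logits[0, 2, 2] = 5.0  # make "tokyo" the top score at position 2

print(predict_masked(logits, mask_index=2, vocab=toy_vocab))  # tokyo
```

With the real model, the same argmax is taken at the position of the `[MASK]` token, and the resulting id is decoded with the tokenizer.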

Tokenization

The model uses a SentencePiece-based tokenizer; the vocabulary was trained on Japanese Wikipedia using SentencePiece.

Training Data

We used the following corpora for pre-training:

Training Parameters

The learning_rate value in parentheses indicates the learning rate used for additional pre-training on the financial corpus.

  • learning_rate: 2.4e-4 (6e-5)
  • total_train_batch_size: 2,016
  • max_seq_length: 512
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-06
  • lr_scheduler_type: linear schedule with warmup
  • training_steps: 1,000,000
  • warmup_steps: 100,000
  • precision: FP16
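The linear schedule with warmup listed above can be sketched as follows. This is a generic implementation of that scheduler type under the stated hyperparameters (peak learning rate 2.4e-4, 100,000 warmup steps, 1,000,000 total steps), not the authors' exact training code:

```python
def linear_warmup_lr(step, peak_lr=2.4e-4, warmup_steps=100_000, total_steps=1_000_000):
    """Linear warmup from 0 to peak_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

print(linear_warmup_lr(50_000))     # halfway through warmup: 1.2e-4
print(linear_warmup_lr(100_000))    # peak: 2.4e-4
print(linear_warmup_lr(1_000_000))  # end of training: 0.0
```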

Fine-tuning on General NLU tasks

We evaluated our model by averaging over five seeds.
Results for the other models are taken from the JGLUE repository.

Model                 JSTS (Pearson/Spearman)   JNLI (acc)   JCommonsenseQA (acc)
DeBERTaV2 base        0.919/0.882               0.912        0.859
Waseda RoBERTa base   0.913/0.873               0.895        0.840
Tohoku BERT base      0.909/0.868               0.899        0.808

Citation

The citation will be updated; please check for the latest version before citing.

@article{Suzuki-etal-2023-ipm,
  title = {Constructing and analyzing domain-specific language model for financial text mining},
  author = {Masahiro Suzuki and Hiroki Sakaji and Masanori Hirano and Kiyoshi Izumi},
  journal = {Information Processing \& Management},
  volume = {60},
  number = {2},
  pages = {103194},
  year = {2023},
  doi = {10.1016/j.ipm.2022.103194}
}

Licenses

The pretrained models are distributed under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0).

Acknowledgments

This work was supported in part by JSPS KAKENHI Grant Number JP21K12010 and JST-Mirai Program Grant Number JPMJMI20B1, Japan.
