Semantic Specialization for Knowledge-based Word Sense Disambiguation

  • This repository contains the trained model (projection heads) and sense/context embeddings used for training and evaluating the model.
  • For instructions on using these files, see the semantic_specialization_for_wsd repository.

Trained Model (Projection Heads)

  • File: checkpoints/baseline/last.ckpt
  • This is one of the five trained models (one per training run) used for reporting the main results (Table 2 in [Mizuki and Okazaki, EACL2023]).
  • The main hyperparameters used for training are as follows:
| Argument name | Value | Description |
|---|---|---|
| max_epochs | 15 | Maximum number of training epochs |
| cfg_similarity_class.temperature ($\beta^{-1}$) | 0.015625 (=1/64) | Temperature parameter for the contrastive loss |
| batch_size ($N_B$) | 256 | Number of samples in each batch for the attract-repel and self-training objectives |
| coef_max_pool_margin_loss ($\alpha$) | 0.2 | Coefficient for the self-training loss |
| cfg_gloss_projection_head.n_layer | 2 | Number of FFNN layers for the projection heads |
| cfg_gloss_projection_head.max_l2_norm_ratio ($\epsilon$) | 0.015 | Hyperparameter for the distance constraint integrated in the projection heads |
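
To make the temperature value concrete, here is a minimal sketch in plain Python of how a temperature of 1/64 sharpens a softmax over similarity scores. It is a generic illustration of temperature scaling as commonly used in contrastive objectives, not the authors' loss implementation; the helper name `temperature_softmax` and the example similarity values are assumptions made purely for illustration.

```python
import math

# Value from the hyperparameter list: cfg_similarity_class.temperature = 1/64.
TEMPERATURE = 0.015625


def temperature_softmax(similarities, temperature=TEMPERATURE):
    """Softmax over similarity scores divided by a temperature.

    Hypothetical helper for illustration; not taken from the authors' code.
    """
    scaled = [s / temperature for s in similarities]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up cosine similarities between one context and three candidate senses.
sims = [0.80, 0.75, 0.40]

sharp = temperature_softmax(sims)                   # temperature = 1/64
flat = temperature_softmax(sims, temperature=1.0)   # no sharpening

print([round(p, 3) for p in sharp])
print([round(p, 3) for p in flat])
```

With the small temperature, most of the probability mass lands on the highest-similarity candidate even though the top two similarities differ by only 0.05; with a temperature of 1.0 the distribution stays close to uniform.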

Sense/Context Embeddings

  • Directory: data/bert_embeddings/
  • Sense embeddings: bert-large-cased_WordNet_Gloss_Corpus.hdf5
  • Context embeddings for the self-training objective: bert-large-cased_SemCor.hdf5
  • Context embeddings for evaluating the WSD task: bert-large-cased_WSDEval-ALL.hdf5
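
The internal layout of these HDF5 files is not described here, so a reasonable first step is to list what each file contains. The following is a minimal sketch, assuming the third-party `h5py` package is installed and the files have been downloaded locally. The helper name `list_hdf5_datasets` is hypothetical, and no particular group or dataset names are assumed.

```python
def list_hdf5_datasets(path):
    """Return (name, shape, dtype) for every dataset in an HDF5 file.

    Hypothetical helper for illustration; it makes no assumption about
    how the embedding files are organized internally.
    """
    # Imported here so that h5py is only required when the helper is called.
    import h5py

    found = []

    def collect(name, obj):
        # visititems calls this for every group and dataset in the file.
        if isinstance(obj, h5py.Dataset):
            found.append((name, obj.shape, str(obj.dtype)))

    with h5py.File(path, "r") as f:
        f.visititems(collect)
    return found


# Example usage (file name taken from the list above; adjust the path to
# wherever the file was downloaded):
#
# path = "data/bert_embeddings/bert-large-cased_WordNet_Gloss_Corpus.hdf5"
# for name, shape, dtype in list_hdf5_datasets(path):
#     print(name, shape, dtype)
```

For the actual loading logic used in training and evaluation, the semantic_specialization_for_wsd repository remains the authoritative source.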

Reference

@inproceedings{Mizuki:EACL2023,
    title     = "Semantic Specialization for Knowledge-based Word Sense Disambiguation",
    author    = "Mizuki, Sakae and Okazaki, Naoaki",
    booktitle = "Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume",
    series    = "EACL",
    month     = may,
    year      = "2023",
    address   = "Dubrovnik, Croatia",
    publisher = "Association for Computational Linguistics",
    pages     = "3449--3462",
}