Hierarchy-Transformers/HiT-MiniLM-L6-WordNetNoun

A Hierarchy Transformer Encoder (HiT) model that explicitly encodes entities according to their hierarchical relationships.

Model Description

HiT-MiniLM-L6-WordNet-Hard is a HiT model trained on WordNet's noun hierarchy with hard negative sampling.

  • Developed by: Yuan He, Zhangdie Yuan, Jiaoyan Chen, and Ian Horrocks
  • Model type: Hierarchy Transformer Encoder (HiT)
  • License: Apache license 2.0
  • Hierarchy: WordNet (Noun)
  • Training Dataset: Download wordnet.zip from Datasets for HiTs on Zenodo
  • Pre-trained model: sentence-transformers/all-MiniLM-L6-v2
  • Training Objectives: Jointly optimised on hyperbolic clustering and hyperbolic centripetal losses

Model Sources

Usage

HiT models encode entities (presented as texts) and predict their hierarchical relationships in hyperbolic space.

Get Started

Install hierarchy_transformers (check our repository) through pip or GitHub.

Use the code below to get started with the model.

from hierarchy_transformers import HierarchyTransformer
from hierarchy_transformers.utils import get_torch_device

# set up the device (use cpu if no gpu found)
gpu_id = 0
device = get_torch_device(gpu_id)

# load the model
model = HierarchyTransformer.load_pretrained('Hierarchy-Transformers/HiT-MiniLM-L6-WordNetNoun-Hard', device)

# entity names to be encoded.
entity_names = ["computer", "personal computer", "fruit", "berry"]

# get the entity embeddings
entity_embeddings = model.encode(entity_names)

Default Probing for Subsumption Prediction

Use the entity embeddings to predict the subsumption relationships between them.

# suppose we want to compare "personal computer" and "computer", "berry" and "fruit"
child_entity_embeddings = model.encode(["personal computer", "berry"], convert_to_tensor=True)
parent_entity_embeddings = model.encode(["computer", "fruit"], convert_to_tensor=True)

# compute the hyperbolic distances and norms of entity embeddings
dists = model.manifold.dist(child_entity_embeddings, parent_entity_embeddings)
child_norms = model.manifold.dist0(child_entity_embeddings)
parent_norms = model.manifold.dist0(parent_entity_embeddings)

# use the empirical function for subsumption prediction proposed in the paper
# `centri_score_weight` and the overall threshold are determined on the validation set;
# the value below is only a placeholder so that the snippet runs
centri_score_weight = 1.0
subsumption_scores = - (dists + centri_score_weight * (parent_norms - child_norms))
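
To make the scoring function above concrete, here is a minimal, dependency-free sketch of the same computation on toy 2-D points. It assumes the manifold is a Poincaré ball with curvature -c, that `dist` is the geodesic distance between two points, and that `dist0` is the distance to the origin. The function names, the default `c = 1.0`, the toy coordinates, and the weight value are illustrative assumptions, not the library's implementation or tuned values.

```python
import math

def poincare_dist(u, v, c=1.0):
    """Geodesic distance between u and v on a Poincare ball of curvature -c."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    arg = 1.0 + 2.0 * c * sq_diff / ((1.0 - c * sq_u) * (1.0 - c * sq_v))
    return math.acosh(arg) / math.sqrt(c)

def poincare_dist0(u, c=1.0):
    """Hyperbolic distance from the origin to u (the hyperbolic norm)."""
    norm = math.sqrt(sum(a * a for a in u))
    return 2.0 * math.atanh(math.sqrt(c) * norm) / math.sqrt(c)

def subsumption_score(child, parent, centri_score_weight, c=1.0):
    """Same form as the empirical score: -(dist + weight * (parent_norm - child_norm))."""
    dist = poincare_dist(child, parent, c)
    norm_gap = poincare_dist0(parent, c) - poincare_dist0(child, c)
    return -(dist + centri_score_weight * norm_gap)

# Toy points (assumed): the more general entity sits closer to the origin.
parent = [0.1, 0.0]   # stands in for "computer"
child = [0.5, 0.0]    # stands in for "personal computer"

forward = subsumption_score(child, parent, centri_score_weight=1.0)
backward = subsumption_score(parent, child, centri_score_weight=1.0)
print(forward, backward)
```

In this toy setting the score for the correct direction (child under parent) is higher than the score for the reversed pair, because the norm term rewards parents that lie closer to the origin than their children.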

Training and evaluation scripts are available at GitHub. Technical details are presented in the paper.
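
The snippet above notes that `centri_score_weight` and the decision threshold are chosen on a validation set. One plausible way to do this, sketched here under assumptions, is a small grid search that maximises F1 over labelled validation pairs. The helper names, the candidate grids, and the toy numbers are all hypothetical; the released evaluation scripts may proceed differently.

```python
def f1_score(preds, labels):
    """F1 of boolean predictions against boolean gold labels."""
    tp = sum(1 for p, l in zip(preds, labels) if p and l)
    fp = sum(1 for p, l in zip(preds, labels) if p and not l)
    fn = sum(1 for p, l in zip(preds, labels) if not p and l)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def grid_search(dists, child_norms, parent_norms, labels, weights, thresholds):
    """Return (best_f1, weight, threshold) over the candidate grids."""
    best = (-1.0, None, None)
    for w in weights:
        # same empirical score as in the model card snippet
        scores = [-(d + w * (pn - cn))
                  for d, cn, pn in zip(dists, child_norms, parent_norms)]
        for t in thresholds:
            preds = [s > t for s in scores]  # predict subsumption above threshold
            f1 = f1_score(preds, labels)
            if f1 > best[0]:
                best = (f1, w, t)
    return best

# Toy validation data (made up): two true subsumptions, two negatives.
dists = [1.0, 1.2, 3.0, 3.5]
child_norms = [2.0, 2.2, 1.0, 1.5]
parent_norms = [1.0, 1.0, 2.0, 2.5]
labels = [True, True, False, False]

best_f1, best_weight, best_threshold = grid_search(
    dists, child_norms, parent_norms, labels,
    weights=[0.0, 0.5, 1.0],
    thresholds=[-4.0, -3.0, -2.0, -1.0, 0.0],
)
print(best_f1, best_weight, best_threshold)
```

With real data, the distances and norms would come from `model.manifold.dist` and `model.manifold.dist0` on the validation pairs, and the selected weight and threshold would then be reused unchanged on the test set.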

Full Model Architecture

HierarchyTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False})
)

Citation

Preprint on arXiv: https://arxiv.org/abs/2401.11374.

Yuan He, Zhangdie Yuan, Jiaoyan Chen, Ian Horrocks. Language Models as Hierarchy Encoders. arXiv preprint arXiv:2401.11374 (2024).

@article{he2024language,
  title={Language Models as Hierarchy Encoders},
  author={He, Yuan and Yuan, Zhangdie and Chen, Jiaoyan and Horrocks, Ian},
  journal={arXiv preprint arXiv:2401.11374},
  year={2024}
}

Model Card Contact

For any queries or feedback, please contact Yuan He (yuan.he@cs.ox.ac.uk).

Safetensors model size: 22.7M params (tensor type: F32)