Hierarchy-Transformers/HiT-MPNet-WordNetNoun

A Hierarchy Transformer Encoder (HiT) model that explicitly encodes entities according to their hierarchical relationships.

Model Description

HiT-MPNet-WordNetNoun is a HiT model trained on WordNet's subsumption (hypernym) hierarchy of noun entities.

Developed by: Yuan He, Zhangdie Yuan, Jiaoyan Chen, and Ian Horrocks
Model type: Hierarchy Transformer Encoder (HiT)
License: Apache License 2.0
Hierarchy: WordNet's subsumption (hypernym) hierarchy of noun entities.
Training Dataset: Hierarchy-Transformers/WordNetNoun
Pre-trained model: sentence-transformers/all-mpnet-base-v2
Training Objectives: Jointly optimised on Hyperbolic Clustering and Hyperbolic Centripetal losses (see definitions in the paper)
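
As a rough illustration of the two training objectives, the sketch below writes them as per-triple hinge losses over precomputed hyperbolic distances and norms. It is a simplified sketch, assuming the margin-based forms defined in the paper; the function names and the margin values are hypothetical, and the actual training code lives in the repository.

```python
def hyperbolic_clustering_loss(dist_to_parent, dist_to_negative, margin):
    """Triplet-style hinge: pull an entity towards its parent and push it
    away from a negative entity, both measured by hyperbolic distance."""
    return max(dist_to_parent - dist_to_negative + margin, 0.0)


def hyperbolic_centripetal_loss(parent_norm, child_norm, margin):
    """Hinge on hyperbolic norms: the parent should sit closer to the
    origin of the manifold than the child."""
    return max(parent_norm - child_norm + margin, 0.0)


def joint_loss(dist_to_parent, dist_to_negative, parent_norm, child_norm,
               cluster_margin, centri_margin):
    """Sum of the two objectives for a single (entity, parent, negative) triple."""
    return (
        hyperbolic_clustering_loss(dist_to_parent, dist_to_negative, cluster_margin)
        + hyperbolic_centripetal_loss(parent_norm, child_norm, centri_margin)
    )


# Toy numbers (hypothetical): the parent is already much closer than the
# negative, but it lies farther from the origin than the child.
example = joint_loss(
    dist_to_parent=1.0, dist_to_negative=5.0,
    parent_norm=2.0, child_norm=0.5,
    cluster_margin=1.0, centri_margin=0.5,
)
```

In this toy case only the centripetal term contributes, because the clustering constraint is already satisfied by more than its margin.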

Model Versions

Version	Model Revision	Note
v1.0 (Random Negatives)	`main` or `v1-random-negatives`	The variant trained on random negatives, as detailed in the paper.
v1.0 (Hard Negatives)	`v1-hard-negatives`	The variant trained on hard negatives, as detailed in the paper.

Model Sources

Repository: https://github.com/KRR-Oxford/HierarchyTransformers
Paper: Language Models as Hierarchy Encoders

Usage

HiT models encode entities (presented as texts) into hyperbolic space, where their hierarchical relationships can be predicted from the embeddings.

Get Started

Install the hierarchy_transformers package through pip (`pip install hierarchy_transformers`) or from source on GitHub (check our repository for instructions).

Use the code below to get started with the model.

from hierarchy_transformers import HierarchyTransformer

# load the model
model = HierarchyTransformer.from_pretrained('Hierarchy-Transformers/HiT-MPNet-WordNetNoun')

# entity names to be encoded.
entity_names = ["computer", "personal computer", "fruit", "berry"]

# get the entity embeddings
entity_embeddings = model.encode(entity_names)

Default Probing for Subsumption Prediction

Use the entity embeddings to predict the subsumption relationships between them.

# suppose we want to compare "personal computer" and "computer", "berry" and "fruit"
child_entity_embeddings = model.encode(["personal computer", "berry"], convert_to_tensor=True)
parent_entity_embeddings = model.encode(["computer", "fruit"], convert_to_tensor=True)

# compute the hyperbolic distances and norms of entity embeddings
dists = model.manifold.dist(child_entity_embeddings, parent_entity_embeddings)
child_norms = model.manifold.dist0(child_entity_embeddings)
parent_norms = model.manifold.dist0(parent_entity_embeddings)

# use the empirical function for subsumption prediction proposed in the paper
# `centri_score_weight` and the overall threshold are determined on the validation set
subsumption_scores = - (dists + centri_score_weight * (parent_norms - child_norms))
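
To make the scoring function above concrete without loading the model, here is a minimal pure-Python sketch of the same computation on toy 2-D points. It assumes a Poincaré ball of curvature -c with c = 1.0 by default (an assumption for illustration; the real manifold and its curvature come from `model.manifold`). The helper names `poincare_dist`, `poincare_norm`, and `subsumption_score`, the toy coordinates, and the weight value are all hypothetical.

```python
import math


def poincare_dist(u, v, c=1.0):
    """Geodesic distance between two points inside a Poincaré ball of curvature -c."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    arg = 1.0 + 2.0 * c * sq_diff / ((1.0 - c * sq_u) * (1.0 - c * sq_v))
    return math.acosh(arg) / math.sqrt(c)


def poincare_norm(u, c=1.0):
    """Hyperbolic norm: the distance from the origin of the ball to u."""
    return poincare_dist(u, [0.0] * len(u), c)


def subsumption_score(child, parent, centri_score_weight, c=1.0):
    """Higher (less negative) scores suggest that `parent` subsumes `child`."""
    dist = poincare_dist(child, parent, c)
    norm_gap = poincare_norm(parent, c) - poincare_norm(child, c)
    return -(dist + centri_score_weight * norm_gap)


# Toy embeddings (hypothetical): the parent lies nearer the origin than the child.
parent = [0.1, 0.0]
child = [0.5, 0.0]

forward = subsumption_score(child, parent, centri_score_weight=1.0)
reverse = subsumption_score(parent, child, centri_score_weight=1.0)
print(f"child -> parent: {forward:.4f}, parent -> child: {reverse:.4f}")
```

The score is asymmetric: the distance term is the same in both directions, while the norm term rewards pairs whose candidate parent is closer to the origin. A pair is predicted as a subsumption when its score exceeds a threshold which, like `centri_score_weight`, has to be tuned on validation data.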

Train Your Own Models

Use the example scripts in our repository to reproduce existing models and train/evaluate your own models.

Full Model Architecture

HierarchyTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False})
)

Citation

Yuan He, Zhangdie Yuan, Jiaoyan Chen, Ian Horrocks. Language Models as Hierarchy Encoders. Advances in Neural Information Processing Systems 37 (NeurIPS 2024).

@inproceedings{NEURIPS2024_1a970a3e,
 author = {He, Yuan and Yuan, Moy and Chen, Jiaoyan and Horrocks, Ian},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {A. Globerson and L. Mackey and D. Belgrave and A. Fan and U. Paquet and J. Tomczak and C. Zhang},
 pages = {14690--14711},
 publisher = {Curran Associates, Inc.},
 title = {Language Models as Hierarchy Encoders},
 url = {https://proceedings.neurips.cc/paper_files/paper/2024/file/1a970a3e62ac31c76ec3cea3a9f68fdf-Paper-Conference.pdf},
 volume = {37},
 year = {2024}
}

Model Card Contact

For any queries or feedback, please contact Yuan He (yuan.he(at)cs.ox.ac.uk).
