---
library_name: hierarchy-transformers
pipeline_tag: feature-extraction
tags:
- hierarchy-transformers
- feature-extraction
- hierarchy-encoding
- subsumption-relationships
- transformers
license: apache-2.0
language:
- en
---

# Hierarchy-Transformers/HiT-MPNet-WordNetNoun-Hard

A **Hi**erarchy **T**ransformer Encoder (HiT) model that explicitly encodes entities according to their hierarchical relationships.

### Model Description

<!-- Provide a longer summary of what this model is. -->

HiT-MPNet-WordNetNoun-Hard is a HiT model trained on WordNet's noun hierarchy with **hard** negative sampling.

- **Developed by:** [Yuan He](https://www.yuanhe.wiki/), Zhangdie Yuan, Jiaoyan Chen, and Ian Horrocks
- **Model type:** Hierarchy Transformer Encoder (HiT)
- **License:** Apache license 2.0
- **Hierarchy**: WordNet (Noun)
- **Training Dataset**: Download `wordnet.zip` from [Datasets for HiTs on Zenodo](https://zenodo.org/doi/10.5281/zenodo.10511042)
- **Pre-trained model:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2)
- **Training Objectives**: Jointly optimised on *hyperbolic clustering* and *hyperbolic centripetal* losses (sketched informally below)
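
For intuition, both objectives are margin-based losses over hyperbolic (Poincaré-ball) distances. Very roughly (notation simplified here; see the paper for the exact definitions), with \\(d\\) the hyperbolic distance, \\(e_c\\), \\(e_p\\), \\(e_n\\) the embeddings of a child entity, its parent, and a negative (non-parent), and \\(O\\) the origin of the manifold:

$$
\mathcal{L}_{\text{cluster}} = \max\big(0,\; d(e_c, e_p) - d(e_c, e_n) + \alpha\big), \qquad
\mathcal{L}_{\text{centri}} = \max\big(0,\; d(e_p, O) - d(e_c, O) + \beta\big)
$$

so that related entities are clustered together while parent entities are pulled closer to the manifold origin than their children.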

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/KRR-Oxford/HierarchyTransformers
- **Paper:** [Language Models as Hierarchy Encoders](https://arxiv.org/abs/2401.11374)

## Usage

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

HiT models are used to encode entities (presented as texts) and predict their hierarchical relationships in hyperbolic space.

### Get Started

Install the `hierarchy_transformers` package via `pip` or from source (see our [repository](https://github.com/KRR-Oxford/HierarchyTransformers)).

Use the code below to get started with the model.

```python
from hierarchy_transformers import HierarchyTransformer
from hierarchy_transformers.utils import get_torch_device

# set up the device (use cpu if no gpu found)
gpu_id = 0
device = get_torch_device(gpu_id)

# load the model
model = HierarchyTransformer.load_pretrained('Hierarchy-Transformers/HiT-MPNet-WordNetNoun-Hard', device)

# entity names to be encoded.
entity_names = ["computer", "personal computer", "fruit", "berry"]

# get the entity embeddings
entity_embeddings = model.encode(entity_names)
```
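
`model.encode` returns one embedding per input name. As a quick sanity check of the output shape (a minimal sketch; the embedding dimension depends on the underlying pre-trained model):

```python
# One embedding per entity name, living on the model's Poincaré ball.
print(entity_embeddings.shape)  # (number of names, embedding dimension)
```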

### Default Probing for Subsumption Prediction

Use the entity embeddings to predict the subsumption relationships between them.

```python
# suppose we want to compare "personal computer" and "computer", "berry" and "fruit"
child_entity_embeddings = model.encode(["personal computer", "berry"], convert_to_tensor=True)
parent_entity_embeddings = model.encode(["computer", "fruit"], convert_to_tensor=True)

# compute the hyperbolic distances and norms of entity embeddings
dists = model.manifold.dist(child_entity_embeddings, parent_entity_embeddings)
child_norms = model.manifold.dist0(child_entity_embeddings)
parent_norms = model.manifold.dist0(parent_entity_embeddings)

# use the empirical scoring function for subsumption prediction proposed in the paper;
# `centri_score_weight` (and the decision threshold) should be tuned on the validation set
centri_score_weight = 1.0  # placeholder value; tune on your validation split
subsumption_scores = - (dists + centri_score_weight * (parent_norms - child_norms))
```
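
To turn the scores into binary subsumption predictions, apply a decision threshold tuned on the validation set. A minimal sketch (the threshold below is a placeholder, not a tuned value from the paper):

```python
# Predict "child is subsumed by parent" when the score exceeds a validation-tuned threshold.
threshold = 0.0  # placeholder; pick the value that maximises validation F1
predictions = subsumption_scores > threshold
print(predictions)  # boolean tensor, one prediction per (child, parent) pair
```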

Training and evaluation scripts are available in our [GitHub repository](https://github.com/KRR-Oxford/HierarchyTransformers).
Technical details are presented in the [paper](https://arxiv.org/abs/2401.11374).



## Full Model Architecture
```
HierarchyTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False})
)
```

## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

Preprint on arXiv: https://arxiv.org/abs/2401.11374.

*Yuan He, Zhangdie Yuan, Jiaoyan Chen, Ian Horrocks.* **Language Models as Hierarchy Encoders.** arXiv preprint arXiv:2401.11374 (2024).

```
@article{he2024language,
  title={Language Models as Hierarchy Encoders},
  author={He, Yuan and Yuan, Zhangdie and Chen, Jiaoyan and Horrocks, Ian},
  journal={arXiv preprint arXiv:2401.11374},
  year={2024}
}
```


## Model Card Contact

For any queries or feedback, please contact Yuan He (yuan.he@cs.ox.ac.uk).