---
pipeline_tag: sentence-similarity
tags:
- formula-transformers
- feature-extraction
- formula-similarity

---

# CLFE(ConMath)

This is a formula embedding model trained on the LaTeX, Presentation MathML, and Content MathML representations of formulas. It maps formulas to a 768-dimensional dense vector space. The model was introduced in https://link.springer.com/chapter/10.1007/978-981-99-7254-8_8


## Usage


```
pip install -U sentence-transformers
```
Copy `MarkuplmTransformerForConMATH.py` into the `sentence_transformers/models` directory, then add the line `from .MarkuplmTransformerForConMATH import MarkuplmTransformerForConMATH` to `sentence_transformers/models/__init__.py`.

Then you can use the model like this:

```python
from sentence_transformers import SentenceTransformer
# The same formula, 13 × x, in three representations.
latex = r"13\times x"
pmml = r"<math><semantics><mrow><mn>13</mn><mo>×</mo><mi>x</mi></mrow></semantics></math>"
cmml = r"<math><apply><times></times><cn>13</cn><ci>x</ci></apply></math>"

model = SentenceTransformer('Jyiyiyiyi/CLFE_ConMath')

# Each input is a dict whose key selects the encoder: 'latex' for LaTeX,
# 'mathml' for both Presentation MathML and Content MathML.
embedding_latex = model.encode([{'latex': latex}])
embedding_pmml = model.encode([{'mathml': pmml}])
embedding_cmml = model.encode([{'mathml': cmml}])

print('latex embedding:')
print(embedding_latex)
print('Presentation MathML embedding:')
print(embedding_pmml)
print('Content MathML embedding:')
print(embedding_cmml)
```
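
Since the model is meant for formula similarity, a natural next step is to compare two embeddings with cosine similarity. The snippet below is a minimal sketch in plain Python, not part of the model's own code: `cosine_similarity` is a hypothetical helper, and the toy vectors are placeholders rather than real model output. It assumes `model.encode` returns one row per input, so a real call would pass the first row of each result.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (hypothetical helper)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(float(a) * float(b) for a, b in zip(u, v))
    norm_u = math.sqrt(sum(float(a) ** 2 for a in u))
    norm_v = math.sqrt(sum(float(b) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# With real embeddings (assuming one row per encoded input), one would call:
#   cosine_similarity(embedding_latex[0], embedding_pmml[0])

# Toy placeholder vectors, NOT real model output, just to show the call.
toy_a = [1.0, 0.0, 1.0]
toy_b = [1.0, 0.0, 1.0]
toy_c = [0.0, 1.0, 0.0]

print(cosine_similarity(toy_a, toy_b))  # identical direction: close to 1
print(cosine_similarity(toy_a, toy_c))  # orthogonal vectors: 0
```

Values near 1 indicate that two formulas lie close together in the embedding space; values near 0 indicate unrelated directions.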


## Full Model Architecture
```
SentenceTransformer(
  (0): Asym(
    (latex-0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel 
    (mathml-0): MarkuplmTransformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MarkupLMModel 
  )
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False})
)
```
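
The printout above shows two stages: an `Asym` module that routes each input to the `latex` or `mathml` encoder, followed by a `Pooling` layer with only `pooling_mode_mean_tokens` enabled. As a rough illustration of what mean pooling does, the sketch below averages token vectors while skipping padding positions. It is a simplified plain-Python stand-in, not the sentence-transformers implementation; `mean_pool` and the toy inputs are hypothetical.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("need exactly one mask entry per token")
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m]
    if not kept:
        raise ValueError("at least one token must be unmasked")
    dim = len(kept[0])
    # Component-wise mean over the unmasked tokens.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input: three 2-dimensional token vectors; the last one is padding.
# In the real model each token vector would have 768 components.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

Because the padding vector is masked out, it has no effect on the pooled result.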

## Citing & Authors

```
@inproceedings{wang2023math,
  title={Math Information Retrieval with Contrastive Learning of Formula Embeddings},
  author={Wang, Jingyi and Tian, Xuedong},
  booktitle={International Conference on Web Information Systems Engineering},
  pages={97--107},
  year={2023},
  organization={Springer}
}
```