This model, (DeLADE+[CLS])+, fuses lexical and semantic matching components in a single transformer with DistilBERT as the backbone, and is trained with hard negative mining and knowledge distillation from a ColBERT teacher. Details are in the paper below.

*[A Dense Representation Framework for Lexical and Semantic Matching](https://arxiv.org/pdf/2112.04666.pdf)* by Sheng-Chieh Lin and Jimmy Lin.

Usage instructions are available in our [DHR repo](https://github.com/jacklin64/DHR): (1) [Inference on MSMARCO Passage Ranking](https://github.com/castorini/DHR/blob/main/docs/msmarco-passage-train-eval.md); (2) [Inference on BEIR datasets](https://github.com/castorini/DHR/blob/main/docs/beir-eval.md).
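
As a quick illustration, the sketch below shows how a checkpoint with a DistilBERT-compatible architecture could be loaded with Hugging Face Transformers and used to obtain a [CLS] embedding for a query. The model identifier is a placeholder, not a verified Hub ID, and the full lexical + semantic scoring pipeline follows the DHR repo documentation linked above rather than this minimal example.

```python
# Minimal sketch: load a DistilBERT-based checkpoint and take the [CLS]
# token embedding as a dense representation. The model_id is a placeholder.
import torch
from transformers import AutoTokenizer, AutoModel

model_id = "path/to/DeLADE-CLS-plus-checkpoint"  # placeholder, substitute the actual checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
model.eval()

query = "what is dense retrieval?"
inputs = tokenizer(query, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# [CLS] embedding used as the dense semantic representation
cls_embedding = outputs.last_hidden_state[:, 0]
print(cls_embedding.shape)  # (1, hidden_size); 768 for DistilBERT
```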