This model, (DeLADE+[CLS])+, fuses neural lexical and semantic components in a single transformer, with DistilBERT as the backbone. It is trained with hard negative mining and knowledge distillation from a ColBERT teacher, as detailed in the paper below.
A Dense Representation Framework for Lexical and Semantic Matching. Sheng-Chieh Lin and Jimmy Lin.
Usage instructions for the model are in our DHR repo: (1) Inference on MSMARCO Passage Ranking; (2) Inference on BEIR datasets.
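
As a rough intuition for what fusing lexical and semantic components can mean at scoring time, the toy sketch below concatenates a lexical vector and a semantic vector into one dense vector and scores a query-passage pair with a single inner product. This is an illustrative assumption, not the DHR implementation: the `fuse` helper, the vector sizes, and the values are all hypothetical, and the actual representations and scoring are described in the paper and the DHR repo.

```python
import numpy as np


def fuse(lexical: np.ndarray, semantic: np.ndarray) -> np.ndarray:
    """Hypothetical helper: concatenate lexical and semantic vectors into one dense vector."""
    return np.concatenate([lexical, semantic])


# Toy vectors chosen by hand; these are NOT outputs of the actual model.
q_lex = np.array([0.0, 1.5, 0.0, 0.7])   # query, lexical component
q_sem = np.array([0.2, -0.1, 0.4])       # query, semantic component
p_lex = np.array([0.3, 1.0, 0.0, 0.0])   # passage, lexical component
p_sem = np.array([0.1, 0.3, 0.5])        # passage, semantic component

# One inner product over the fused vectors...
fused_score = float(fuse(q_lex, q_sem) @ fuse(p_lex, p_sem))

# ...equals the sum of the per-component inner products.
lexical_score = float(q_lex @ p_lex)
semantic_score = float(q_sem @ p_sem)

print(fused_score, lexical_score + semantic_score)
```

Under this assumption, a single dense index over the fused vectors would rank passages by lexical and semantic match together, without merging two separate result lists.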