---
license: apache-2.0
language: "en"
tags:
- bag-of-words
- dense-passage-retrieval
- knowledge-distillation
datasets:
- ms_marco
---

# Uni-ColBERTer (Dim: 1) for Passage Retrieval

If you want to know more about our (Uni-)ColBERTer architecture, check out our paper: https://arxiv.org/abs/2203.13088 🎉

For more information, source code, and a minimal usage example, please visit: https://github.com/sebastian-hofstaetter/colberter

## Limitations & Bias

- The model is only trained on English text.
- The model inherits social biases from both DistilBERT and MSMARCO.
- The model is only trained on the relatively short passages of MSMARCO (average length: 60 words), so it might struggle with longer text.

## Citation

If you use our model checkpoint, please cite our work as:

```
@article{Hofstaetter2022_colberter,
 author = {Sebastian Hofst{\"a}tter and Omar Khattab and Sophia Althammer and Mete Sertkan and Allan Hanbury},
 title = {Introducing Neural Bag of Whole-Words with ColBERTer: Contextualized Late Interactions using Enhanced Reduction},
 publisher = {arXiv},
 url = {https://arxiv.org/abs/2203.13088},
 doi = {10.48550/ARXIV.2203.13088},
 year = {2022},
}
```
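
## Minimal Usage Sketch

As a quick illustration, the sketch below loads this checkpoint's tokenizer and underlying DistilBERT-based encoder with Hugging Face Transformers and produces contextualized token embeddings. It is only a sketch: the model id `sebastian-hofstaetter/uni-colberter-128-1-msmarco` is an assumed Hub id for this checkpoint, and the full ColBERTer scoring pipeline (whole-word aggregation and the dimension-1 unique-word bag) is implemented in the GitHub repository linked above, not by `AutoModel`.

```python
# Minimal sketch, assuming the Hub id below points at this checkpoint.
# AutoModel loads only the shared encoder weights; ColBERTer's reduction steps
# (whole-word pooling, reduction to dim 1) are applied on top of these in the
# GitHub repository linked above.
import torch
from transformers import AutoTokenizer, AutoModel

model_id = "sebastian-hofstaetter/uni-colberter-128-1-msmarco"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
encoder = AutoModel.from_pretrained(model_id)

passage = "ColBERTer scores unique whole words instead of subword tokens."
inputs = tokenizer(passage, return_tensors="pt")

with torch.no_grad():
    # Contextualized subword representations from the encoder.
    token_embeddings = encoder(**inputs).last_hidden_state

print(token_embeddings.shape)  # (1, sequence_length, hidden_size)
```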