---
pipeline_tag: sentence-similarity
license: apache-2.0
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- transformers
---

# kornwtp/ConGen-paraphrase-multilingual-mpnet-base-v2

This is a [ConGen](https://github.com/KornWtp/ConGen) model. It maps sentences to a 768-dimensional dense vector space and can be used for tasks like semantic search.

## Usage

Using this model is easy once you have [ConGen](https://github.com/KornWtp/ConGen) installed:

```
pip install -U git+https://github.com/KornWtp/ConGen.git
```

Then you can use the model like this:

```python
from sentence_transformers import SentenceTransformer

# Two Thai sentences with similar meaning:
#   "A group of men play football on the beach"
#   "A group of boys are playing football on the beach"
sentences = ["กลุ่มผู้ชายเล่นฟุตบอลบนชายหาด", "กลุ่มเด็กชายกำลังเล่นฟุตบอลบนชายหาด"]

# Load the model from the Hugging Face Hub and encode each sentence into a vector
model = SentenceTransformer('kornwtp/ConGen-paraphrase-multilingual-mpnet-base-v2')
embeddings = model.encode(sentences)
print(embeddings)
```

## Evaluation Results

For an automated evaluation of this model, see the *Thai Sentence Embeddings Benchmark*: [Semantic Textual Similarity](https://github.com/KornWtp/ConGen#thai-semantic-textual-similarity-benchmark)

## Citing & Authors

```bibtex
@inproceedings{limkonchotiwat-etal-2022-congen,
    title = "{ConGen}: Unsupervised Control and Generalization Distillation For Sentence Representation",
    author = "Limkonchotiwat, Peerat and
      Ponwitayarat, Wuttikorn and
      Lowphansirikul, Lalita and
      Udomcharoenchaikit, Can and
      Chuangsuwanich, Ekapol and
      Nutanong, Sarana",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2022",
    year = "2022",
    publisher = "Association for Computational Linguistics",
}
```
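
## Comparing Embeddings

The usage snippet stops at printing the raw vectors. For semantic search, the usual next step is to compare embeddings with cosine similarity. The following is a minimal, dependency-free sketch of that step, not part of the ConGen codebase: the helper names `cosine_similarity` and `rank_by_similarity` are hypothetical and introduced only for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], candidates: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (candidate index, score) pairs, most similar first (hypothetical helper)."""
    scores = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

With the `embeddings` array from the usage snippet, `cosine_similarity(embeddings[0], embeddings[1])` would give the similarity of the two example sentences. For larger collections, the `sentence_transformers.util.cos_sim` function is likely the more practical choice, since it computes all pairwise scores in one vectorized call.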