---
inference: false
language: pt
datasets:
- assin2
license: mit
---

# DeBERTinha XSmall for Semantic Textual Similarity

## Full regression example

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import numpy as np
import torch

model_name = "sagui-nlp/debertinha-ptbr-xsmall-assin2-sts"
s1 = "A gente faz o aporte financeiro, é como se a empresa fosse parceira do Monte Cristo."
s2 = "Fernando Moraes afirma que não tem vínculo com o Monte Cristo além da parceira."

# Load the fine-tuned model and its tokenizer
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model.eval()

# Encode the two sentences together as a single sentence pair
model_input = tokenizer([s1], [s2], padding=True, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    output = model(**model_input)

# The regression head returns one value per pair: the predicted similarity score
score = output.logits[0].item()
print(f"Similarity Score: {np.round(float(score), 4)}")
```

## Citation

```
@misc{campiotti2023debertinha,
      title={DeBERTinha: A Multistep Approach to Adapt DebertaV3 XSmall for Brazilian Portuguese Natural Language Processing Task},
      author={Israel Campiotti and Matheus Rodrigues and Yuri Albuquerque and Rafael Azevedo and Alyson Andrade},
      year={2023},
      eprint={2309.16844},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```