ruanchaves committed on
Commit 16c6c7a
1 Parent(s): a41cca1

Create README.md

Files changed (1)
  1. README.md +59 -0
README.md ADDED
@@ -0,0 +1,59 @@
---
inference: false
language: pt
datasets:
- assin2
---

# BERTimbau base for Semantic Textual Similarity

This is the [neuralmind/bert-base-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased) model fine-tuned for
Semantic Textual Similarity on the [ASSIN 2](https://huggingface.co/datasets/assin2) dataset.
The model is intended for Portuguese text.

- Git Repo: [Evaluation of Portuguese Language Models](https://github.com/ruanchaves/eplm)
- Demo: [Portuguese Semantic Similarity](https://ruanchaves-portuguese-semantic-similarity.hf.space)
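
For a quick end-to-end check, the model can also be called through the Transformers `text-classification` pipeline. The snippet below is a minimal sketch rather than an official usage example: it passes the sentence pair from the full example that follows as a `text`/`text_pair` dictionary and sets `function_to_apply="none"` so the pipeline returns the raw regression score.

```python
from transformers import pipeline

similarity = pipeline(
    "text-classification",
    model="ruanchaves/bert-base-portuguese-cased-assin2-similarity",
    function_to_apply="none",  # return the raw regression output instead of a sigmoid
)

# Sentence pairs are passed as a dictionary with "text" and "text_pair" keys.
result = similarity({
    "text": "A gente faz o aporte financeiro, é como se a empresa fosse parceira do Monte Cristo.",
    "text_pair": "Fernando Moraes afirma que não tem vínculo com o Monte Cristo além da parceira.",
})
print(result)  # e.g. {'label': 'LABEL_0', 'score': <raw similarity>}
```
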
## Full regression example

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import numpy as np
import torch

model_name = "ruanchaves/bert-base-portuguese-cased-assin2-similarity"
s1 = "A gente faz o aporte financeiro, é como se a empresa fosse parceira do Monte Cristo."
s2 = "Fernando Moraes afirma que não tem vínculo com o Monte Cristo além da parceira."

model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Encode the sentence pair as a single cross-encoder input.
model_input = tokenizer([s1], [s2], padding=True, return_tensors="pt")
with torch.no_grad():
    output = model(**model_input)
    # The regression head emits one similarity score per sentence pair.
    score = output[0][0].detach().numpy().item()
    print(f"Similarity Score: {np.round(float(score), 4)}")
```

Output:

```
Similarity Score: 3.1819
```
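
The predicted value follows the ASSIN 2 annotation scale, which runs from 1 (completely different sentences) to 5 (essentially equivalent sentences). Where a value in [0, 1] is more convenient, a simple min-max rescaling can be applied; the helper below is a hypothetical convenience function, not part of the model or dataset:

```python
def normalize_assin2_score(score: float, low: float = 1.0, high: float = 5.0) -> float:
    """Min-max rescale an ASSIN 2 similarity score from [low, high] to [0, 1]."""
    return min(max((score - low) / (high - low), 0.0), 1.0)

print(normalize_assin2_score(3.1819))  # 0.5455 (rounded)
```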

## Citation

Our research is ongoing, and a paper describing our experiments is in preparation.
In the meantime, if you would like to cite our work or models before the paper is published, please cite our [GitHub repository](https://github.com/ruanchaves/eplm):

```bibtex
@software{Chaves_Rodrigues_eplm_2023,
  author = {Chaves Rodrigues, Ruan and Tanti, Marc and Agerri, Rodrigo},
  doi = {10.5281/zenodo.7781848},
  month = {3},
  title = {{eplm}},
  url = {https://github.com/ruanchaves/eplm},
  version = {1.0.0},
  year = {2023}
}
```