ceyhunemreozturk committed on
Commit 2bc1922
1 Parent(s): 99e2463

A simple code block that shows the usage of the model was added.
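
A note on the added snippet: as written, `tokenizer("Örnek metin")` returns plain Python lists and `model(tokens)` passes the whole encoding as a single positional argument, so the forward pass would most likely fail. The sketch below is one possible runnable variant, not the authors' method. It assumes PyTorch and `transformers` are installed and that the checkpoint loads with the same `AutoModelForSequenceClassification` class used in the commit. The helper names `embed_documents`, `cosine`, and `rank_documents`, the mean pooling over tokens, the 512-token limit, and cosine similarity as the retrieval score are all illustrative assumptions rather than anything stated in the README.

```python
import math


def embed_documents(texts, model_name="KocLab-Bilkent/BERTurk-Legal"):
    """Return one embedding (list of floats) per input text.

    Hypothetical helper: mean pooling is an assumption; the commit itself
    only exposes the token-level tensor `output.hidden_states[-1]`.
    """
    # Imported lazily so the pure-Python ranking helpers below can be used
    # without torch/transformers installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name, output_hidden_states=True
    )
    model.eval()

    # return_tensors="pt" yields tensors instead of Python lists.
    batch = tokenizer(
        texts, padding=True, truncation=True, max_length=512, return_tensors="pt"
    )
    with torch.no_grad():
        output = model(**batch)  # unpack input_ids, attention_mask, ...

    last_hidden = output.hidden_states[-1]  # (batch, seq_len, hidden)
    mask = batch["attention_mask"].unsqueeze(-1).to(last_hidden.dtype)
    # Average token vectors, ignoring padding positions.
    pooled = (last_hidden * mask).sum(dim=1) / mask.sum(dim=1)
    return pooled.tolist()


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
```

Calling `embed_documents(["Örnek metin"])` downloads the model on first use and returns one vector per input text; `rank_documents(query_vec, doc_vecs)` then returns document indices ordered from most to least similar.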

  1. README.md +15 -0
README.md CHANGED
@@ -13,6 +13,21 @@ We introduce BERTurk-Legal which is a transformer-based language model to retrie

  Test dataset can be accessed from the following link: https://github.com/koc-lab/yargitay_retrieval_dataset

+ The model can be loaded and used to create document embeddings as follows. Then, the document embeddings can be utilized for retrieval.
+ ```
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+ bert_model = "KocLab-Bilkent/BERTurk-Legal"
+
+ model = AutoModelForSequenceClassification.from_pretrained(bert_model, output_hidden_states=True)
+ tokenizer = AutoTokenizer.from_pretrained(bert_model)
+
+ tokens = tokenizer("Örnek metin") # a dummy text is provided as input
+
+ output = model(tokens)
+ docEmbeddings = output.hidden_states[-1]
+ ```
+
  ## Citation
  If you use the model, please cite the following conference paper.
  ```