dlicari committed on
Commit f9ec94b · 1 Parent(s): 01ea850

Update README.md

Files changed (1)
  1. README.md +29 -1
README.md CHANGED
@@ -13,4 +13,32 @@ It is the [ITALIAN-LEGAL-BERT](https://huggingface.co/dlicari/Italian-Legal-BERT
13
  It was trained from scratch using a larger training dataset, 6.6GB of civil and criminal cases.
14
  We used the [CamemBERT](https://huggingface.co/docs/transformers/main/en/model_doc/camembert) architecture with a language modeling head on top, the AdamW optimizer, an initial learning rate of 2e-5 (with linear learning rate decay), sequence length 512, batch size 18, and 1 million training steps,
15
  on 8 NVIDIA A100 40GB GPUs using distributed data parallel (each step processes 8 batches). It uses SentencePiece tokenization trained from scratch on a subset of the training set (5 million sentences)
16
- and vocabulary size of 32000
16
+ and vocabulary size of 32000
17
+
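The training figures quoted above can be turned into a quick back-of-the-envelope calculation. The sketch below is not part of the commit. It assumes that "batch size 18" is the per-GPU batch size, that "each step performs 8 batches" means one batch per GPU per step, and that the linear decay runs from 2e-5 to zero with no warmup; the README confirms none of these details, and the function name `linear_lr` is hypothetical.

```python
# Values stated in the README; how they combine is an assumption.
per_device_batch_size = 18      # "batch size 18", assumed to be per GPU
num_devices = 8                 # 8 NVIDIA A100 40GB GPUs
sequence_length = 512
training_steps = 1_000_000
initial_lr = 2e-5

# Sequences processed per optimizer step under distributed data parallel.
sequences_per_step = per_device_batch_size * num_devices
# Upper bound on tokens per step (every sequence padded or packed to 512).
max_tokens_per_step = sequences_per_step * sequence_length
# Total sequences seen over the whole run.
total_sequences = sequences_per_step * training_steps


def linear_lr(step: int) -> float:
    """Linear decay from initial_lr to 0 (assumed: no warmup phase)."""
    remaining = max(0.0, 1.0 - step / training_steps)
    return initial_lr * remaining


print(sequences_per_step)    # 144
print(max_tokens_per_step)   # 73728
print(total_sequences)       # 144000000
print(linear_lr(500_000))    # 1e-05
```

Under these assumptions the effective batch is 144 sequences per step, which is the number to keep in mind when comparing this run with single-GPU setups.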
18
+
19
+ <h2> Usage </h2>
20
+
21
+ The ITALIAN-LEGAL-BERT model can be loaded as follows:
22
+
23
+ ```python
24
+ from transformers import AutoModel, AutoTokenizer
25
+ model_name = "dlicari/Italian-Legal-BERT-SC"
26
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
27
+ model = AutoModel.from_pretrained(model_name)
28
+ ```
29
+
30
+ You can use the fill-mask pipeline from the Transformers library to run inference with ITALIAN-LEGAL-BERT.
31
+ ```python
32
+ # %pip install sentencepiece
33
+ # %pip install transformers
34
+
35
+ from transformers import pipeline
36
+ model_name = "dlicari/Italian-Legal-BERT-SC"
37
+ fill_mask = pipeline("fill-mask", model=model_name)
38
+ fill_mask("Il <mask> ha chiesto revocarsi l'obbligo di pagamento")
39
+ # [{'score': 0.6529251933097839, 'token_str': 'ricorrente'},
40
+ #  {'score': 0.0380014143884182, 'token_str': 'convenuto'},
41
+ #  {'score': 0.0360226035118103, 'token_str': 'richiedente'},
42
+ #  {'score': 0.023908283561468124, 'token_str': 'Condominio'},
43
+ #  {'score': 0.020863816142082214, 'token_str': 'lavoratore'}]
44
+ ```
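The fill-mask pipeline returns a list of dictionaries, so the predictions shown in the added README example are easy to post-process. The sketch below is not part of the commit. It hard-codes the five scores quoted above instead of calling the model, and the helper name `top_candidates` and the `min_score` threshold are hypothetical choices for illustration. The real pipeline output also carries further keys such as `token` and `sequence`, which the README example omits.

```python
# Predictions copied from the README example (only the keys shown there).
predictions = [
    {"score": 0.6529251933097839, "token_str": "ricorrente"},
    {"score": 0.0380014143884182, "token_str": "convenuto"},
    {"score": 0.0360226035118103, "token_str": "richiedente"},
    {"score": 0.023908283561468124, "token_str": "Condominio"},
    {"score": 0.020863816142082214, "token_str": "lavoratore"},
]


def top_candidates(preds, min_score=0.03):
    """Return candidate tokens with score >= min_score, best first.

    `min_score` is an illustrative threshold, not a recommended value.
    """
    ranked = sorted(preds, key=lambda p: p["score"], reverse=True)
    # strip() is defensive: some tokenizers emit a leading space.
    return [p["token_str"].strip() for p in ranked if p["score"] >= min_score]


best = top_candidates(predictions)
print(best)  # ['ricorrente', 'convenuto', 'richiedente']
```

With a live model, the same helper could be applied directly to the list returned by `fill_mask(...)`, since it only reads the `score` and `token_str` keys.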