Viona committed on
Commit
641de86
2 Parent(s): 62b438a 9bceee7

Merge branch 'main' of https://huggingface.co/recobo/agriculture-bert-uncased into main

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -11,7 +11,7 @@ widget:
  A BERT-based language model further pre-trained from the checkpoint of [SciBERT](https://huggingface.co/allenai/scibert_scivocab_uncased).
  The dataset gathered is a balance between scientific and general works in agriculture domain and encompassing knowledge from different areas of agriculture research and practical knowledge.
 
- The corpus contains 1.3 million paragraphs from National Agricultural Library (NAL) from the US Gov. and 4.2 million paragraphs from books and common literature from the **Agriculture Domain**.
+ The corpus contains 1.2 million paragraphs from National Agricultural Library (NAL) from the US Gov. and 5.3 million paragraphs from books and common literature from the **Agriculture Domain**.
 
  The self-supervised learning approach of MLM was used to train the model.
  - Masked language modeling (MLM): taking a sentence, the model randomly masks 15% of the words in the input then run
@@ -23,8 +23,8 @@ The self-supervised learning approach of MLM was used to train the model.
  from transformers import pipeline
  fill_mask = pipeline(
  "fill-mask",
- model="recobo/chemical-bert-uncased",
- tokenizer="recobo/chemical-bert-uncased"
+ model="recobo/agriculture-bert-uncased",
+ tokenizer="recobo/agriculture-bert-uncased"
  )
- fill_mask("we create [MASK]")
+ fill_mask("[MASK] is the practice of cultivating plants and livestock.")
  ```
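
For readers unfamiliar with the MLM objective mentioned in the README text above, the masking step can be sketched roughly as follows. This is an illustration only, not the training code behind this model: it assumes whitespace-split words rather than WordPiece tokens, always substitutes `[MASK]` (BERT-style training also sometimes keeps or randomly replaces the selected token), and the helper name `mask_tokens` is hypothetical.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Mask roughly `mask_prob` of the tokens (hypothetical helper).

    Returns (masked_tokens, labels). `labels` holds the original token at
    each masked position and None elsewhere, so a model would only be
    scored on the positions it has to reconstruct.
    """
    if not tokens:
        return [], []
    rng = random.Random(seed)
    # Always mask at least one token so short sentences still yield a target.
    n_to_mask = max(1, round(len(tokens) * mask_prob))
    positions = set(rng.sample(range(len(tokens)), n_to_mask))
    masked = [MASK_TOKEN if i in positions else tok for i, tok in enumerate(tokens)]
    labels = [tok if i in positions else None for i, tok in enumerate(tokens)]
    return masked, labels


# Assumption: simple whitespace tokenization, for illustration only.
tokens = "agriculture is the practice of cultivating plants and livestock".split()
masked, labels = mask_tokens(tokens, seed=0)
print(" ".join(masked))
```

The model is then trained to predict the original word at each masked position, which is the same task the `fill_mask` pipeline call in the diff exercises at inference time.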