julien-c HF staff committed on
Commit
66c3c7a
1 Parent(s): fee3eab

Migrate model card from transformers-repo

Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/NeuML/bert-small-cord19/README.md

Files changed (1)
  1. README.md +25 -0
README.md ADDED
@@ -0,0 +1,25 @@
+ # BERT-Small fine-tuned on CORD-19 dataset
+
+ [BERT L-6_H-512_A-8 model](https://huggingface.co/google/bert_uncased_L-6_H-512_A-8) fine-tuned on the [CORD-19 dataset](https://www.semanticscholar.org/cord19).
+
+ ## CORD-19 data subset
+ The training data for this model is stored as a [Kaggle dataset](https://www.kaggle.com/davidmezzetti/cord19-qa?select=cord19.txt). The training
+ data is a subset of the full corpus, focusing on high-quality articles with a detected study design.
+
+ ## Building the model
+
+ ```bash
+ python run_language_modeling.py \
+     --model_type bert \
+     --model_name_or_path google/bert_uncased_L-6_H-512_A-8 \
+     --do_train \
+     --mlm \
+     --line_by_line \
+     --block_size 512 \
+     --train_data_file cord19.txt \
+     --per_gpu_train_batch_size 4 \
+     --learning_rate 3e-5 \
+     --num_train_epochs 3.0 \
+     --output_dir bert-small-cord19 \
+     --save_steps 0 \
+     --overwrite_output_dir
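
The data-related flags in the command above (`--line_by_line`, `--block_size 512`, `--mlm`) can be illustrated with a small standalone sketch. It is not code from `run_language_modeling.py`: `read_examples` and `mask_tokens` are hypothetical helper names, whitespace splitting stands in for BERT's WordPiece tokenizer, and the 15% masking rate with an 80/10/10 split is assumed to match the usual BERT masked-language-modeling recipe.

```python
import random

MASK = "[MASK]"
IGNORE = None  # marks positions that do not contribute to the loss


def read_examples(lines, block_size=512):
    """Roughly what --line_by_line does: one example per non-blank line,
    truncated to at most block_size tokens."""
    examples = []
    for line in lines:
        if not line.strip():
            continue  # blank lines are skipped
        tokens = line.split()  # assumption: stand-in for WordPiece tokenization
        examples.append(tokens[:block_size])
    return examples


def mask_tokens(tokens, vocab, rng, mlm_probability=0.15):
    """Roughly what --mlm does: select positions with probability
    mlm_probability; of those, 80% become [MASK], 10% become a random
    token, and 10% keep the original token."""
    inputs = list(tokens)
    labels = [IGNORE] * len(tokens)
    for i, token in enumerate(tokens):
        if rng.random() >= mlm_probability:
            continue  # position not selected for prediction
        labels[i] = token  # the model must recover the original token here
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = MASK
        elif roll < 0.9:
            inputs[i] = rng.choice(vocab)
        # otherwise the original token stays in place
    return inputs, labels


# Illustrative lines only; these are not taken from cord19.txt.
corpus = [
    "coronavirus spike protein binds the ace2 receptor",
    "",
    "a randomized controlled trial of antiviral therapy",
]
examples = read_examples(corpus, block_size=512)
vocab = sorted({tok for ex in examples for tok in ex})
rng = random.Random(0)
for ex in examples:
    masked_inputs, masked_labels = mask_tokens(ex, vocab, rng)
    print(masked_inputs, masked_labels)
```

Treating each line as its own example keeps sentences from different articles out of the same training sequence, which is presumably why the command pairs `--line_by_line` with a plain-text file such as `cord19.txt`.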