MinzaKhan committed on
Commit
ca8e410
1 Parent(s): 5595ead

Update README.md

Files changed (1)
  1. README.md +17 -0
README.md CHANGED
@@ -52,3 +52,20 @@ The corpus was created by downloading and combining 14 novels of the famous auth
52
  The corpus consists of 14 novels by H. G. Wells downloaded from Project Gutenberg. The text that Project Gutenberg adds at the beginning and end of each novel was removed. Then the entire text of each novel
53
  was converted into a single line, and that line was split into 20 parts, giving 20 lines per novel. The lines from all the novels were then combined and
54
  stored in a single text file, which was then used to fine-tune the model.
55
+
56
+ The values of the parameters used during finetuning are:
57
+
58
+ batch_size = 2
59
+
60
+ max length = 1024
61
+
62
+ epochs = 10
63
+
64
+ learning rate = 5e-4
65
+
66
+ warmup steps = 1e2
67
+
68
+ The corpus has been uploaded to Hugging Face and can be accessed at: https://huggingface.co/datasets/MinzaKhan/HGWells
69
+
70
+
71
+
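
The corpus preparation described in the README (strip the Project Gutenberg text, flatten each novel to one line, split it into 20 parts, combine everything into one file) could be sketched roughly as follows. This is only an illustration, not the author's script: the function names are hypothetical, the `*** START OF` / `*** END OF` marker patterns are an assumption about how the Gutenberg text was detected, and splitting on word boundaries into near-equal parts is also an assumption, since the README does not say how the 20 parts were cut.

```python
import re


def strip_gutenberg_boilerplate(text: str) -> str:
    """Keep only the text between the Gutenberg START and END marker lines.

    Assumption: the files carry '*** START OF ... ***' and '*** END OF ... ***'
    markers. If a marker is missing, that side of the text is left untouched.
    """
    start = re.search(r"\*\*\*\s*START OF.*?\*\*\*", text)
    end = re.search(r"\*\*\*\s*END OF.*?\*\*\*", text)
    begin = start.end() if start else 0
    finish = end.start() if end else len(text)
    return text[begin:finish]


def split_into_parts(line: str, parts: int = 20) -> list[str]:
    """Split one long line into `parts` chunks of near-equal word count."""
    words = line.split()
    base, extra = divmod(len(words), parts)
    chunks, pos = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional word each.
        size = base + (1 if i < extra else 0)
        chunks.append(" ".join(words[pos:pos + size]))
        pos += size
    return chunks


def novel_to_lines(raw_text: str, parts: int = 20) -> list[str]:
    """Strip boilerplate, collapse all whitespace to one line, then split it."""
    body = strip_gutenberg_boilerplate(raw_text)
    one_line = " ".join(body.split())
    return split_into_parts(one_line, parts)


def build_corpus(novels: list[str], parts: int = 20) -> str:
    """Combine the lines of every novel into one newline-separated string."""
    lines = []
    for raw_text in novels:
        lines.extend(novel_to_lines(raw_text, parts))
    return "\n".join(lines)
```

With 14 novels and 20 parts each, `build_corpus` would return a string of 280 lines, which could then be written to a single text file.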