chrisociepa committed
Commit
35a3b38
1 Parent(s): 82c8a2c

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -41,7 +41,7 @@ APT2-1B-Base is a base model introducing a new series of the APT2 (Azurro Pretra
  APT2-1B-Base is an autoregressive language model based on the architecture of a transformer. It has been trained with data collected before April 2023.

- 30 billion tokens have been used for training, and the training dataset (the Polish corpus) has over 7 billion tokens. Chinchilla’s scaling law has been applied (20 tokens per every model parameter).
+ 30 billion tokens have been used for training, and the training dataset (the Polish corpus) has over 7 billion tokens.

  A special tokenizer has been prepared and trained for the purpose of training the model.
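For context on the sentence removed in this commit: Chinchilla’s scaling law is commonly summarized as a compute-optimal budget of roughly 20 training tokens per model parameter. A minimal sketch of that arithmetic, using the 1B-parameter size and 30B-token count from the model card (the helper name is ours, purely illustrative):

```python
# Sketch of the Chinchilla rule of thumb referenced in the removed line:
# compute-optimal training uses roughly 20 tokens per model parameter.
# This is an illustration of the heuristic, not the authors' calculation.

def chinchilla_optimal_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Approximate compute-optimal token budget for a model with n_params parameters."""
    return n_params * tokens_per_param

n_params = 1e9  # APT2-1B-Base: ~1 billion parameters
budget = chinchilla_optimal_tokens(n_params)
print(f"Chinchilla-optimal budget: {budget / 1e9:.0f}B tokens")  # -> 20B tokens

# The card states 30B training tokens, i.e. 30 tokens per parameter,
# which exceeds the 20-tokens-per-parameter heuristic.
print(f"Actual ratio: {30e9 / n_params:.0f} tokens per parameter")  # -> 30
```

This may be why the claim was dropped from the README: at 30B tokens for a 1B-parameter model, the token-to-parameter ratio is 30, not the 20 the parenthetical stated.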