Update README.md
README.md
@@ -45,14 +45,4 @@ Fine tuned on a Graphcore IPU-POD64 using `popxl`.
 
 Prompt sentences are tokenized and packed together to form 1024 token sequences, following [HF packing algorithm](https://github.com/huggingface/transformers/blob/v4.20.1/examples/pytorch/language-modeling/run_clm.py). No padding is used.
 Since the model is trained to predict the next token, labels are simply the input sequence shifted by one token.
-Given the training format, no extra care is needed to account for different sequences: the model does not need to know which sentence a token belongs to.
-
-### Fine-tuning hyperparameters
-The following hyperparameters were used:
-
-### Framework versions
-
-- Transformers
-- Pytorch
-- Datasets
-- Tokenizers
+Given the training format, no extra care is needed to account for different sequences: the model does not need to know which sentence a token belongs to.
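For reference, a minimal sketch of the packing and label-shifting the card describes, loosely following the grouping step in the linked `run_clm.py` script. This is not the actual fine-tuning code: the tokenizer checkpoint, the `pack` helper, and the `block_size` name are illustrative assumptions.

```python
from itertools import chain

from transformers import AutoTokenizer

# Illustrative tokenizer; the card does not name the exact checkpoint here.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
block_size = 1024  # packed sequence length stated in the card


def pack(sentences):
    """Tokenize prompt sentences and pack them into fixed-size blocks."""
    # Concatenate all token ids into one long stream -- no padding anywhere,
    # and no markers separating one sentence from the next.
    ids = list(chain.from_iterable(tokenizer(s)["input_ids"] for s in sentences))
    # Drop the ragged tail so the stream splits evenly into block_size chunks.
    total = (len(ids) // block_size) * block_size
    blocks = [ids[i : i + block_size] for i in range(0, total, block_size)]
    # Labels are the inputs shifted by one token, made explicit here to
    # mirror the description above.
    return [{"input_ids": b[:-1], "labels": b[1:]} for b in blocks]
```

Note that in the linked HF script the labels are simply a copy of `input_ids`, and the one-token shift happens inside the model's loss computation; the explicit slicing above shows the equivalent effect.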