Update README.md
README.md
@@ -45,14 +45,4 @@ Fine tuned on a Graphcore IPU-POD64 using `popxl`.
 
 Prompt sentences are tokenized and packed together to form 1024 token sequences, following [HF packing algorithm](https://github.com/huggingface/transformers/blob/v4.20.1/examples/pytorch/language-modeling/run_clm.py). No padding is used.
 Since the model is trained to predict the next token, labels are simply the input sequence shifted by one token.
-Given the training format, no extra care is needed to account for different sequences: the model does not need to know which sentence a token belongs to.
-
-### Fine-tuning hyperparameters
-The following hyperparameters were used:
-
-### Framework versions
-
-- Transformers
-- Pytorch
-- Datasets
-- Tokenizers
+Given the training format, no extra care is needed to account for different sequences: the model does not need to know which sentence a token belongs to.
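For reference, a minimal sketch of the packing and label-shifting the card describes, loosely following the grouping step in the linked `run_clm.py` script. This is not the actual fine-tuning code: the tokenizer checkpoint, the `pack` helper, and the `block_size` name are illustrative assumptions.

```python
from itertools import chain

from transformers import AutoTokenizer

# Illustrative tokenizer; the card does not name the exact checkpoint here.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
block_size = 1024  # packed sequence length stated in the card


def pack(sentences):
    """Tokenize prompt sentences and pack them into fixed-size blocks."""
    # Concatenate all token ids into one long stream -- no padding anywhere,
    # and no markers separating one sentence from the next.
    ids = list(chain.from_iterable(tokenizer(s)["input_ids"] for s in sentences))
    # Drop the ragged tail so the stream splits evenly into block_size chunks.
    total = (len(ids) // block_size) * block_size
    blocks = [ids[i : i + block_size] for i in range(0, total, block_size)]
    # Labels are the inputs shifted by one token, made explicit here to
    # mirror the description above.
    return [{"input_ids": b[:-1], "labels": b[1:]} for b in blocks]
```

Note that in the linked HF script the labels are simply a copy of `input_ids`, and the one-token shift happens inside the model's loss computation; the explicit slicing above shows the equivalent effect.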