ai-forever/FRED-T5-1.7B

sberbank-ai committed on Jan 21, 2023

Commit 7b8f31a · 1 Parent(s): c5ba01c

Update README.md

Files changed (1)

README.md +2 -0

README.md CHANGED

@@ -20,6 +20,8 @@ First half of the time model trained on the small part of all datasets (1%,3GB)
 For RSG we trained as described in the T5 paper. First, we trained multitask for all tasks. Then we took the best checkpoint for the task and trained it further.
+Total training time was around 45 days on 112 A100 GPUs.
 Training loss:
 ![Screenshot 2023-01-21 at 11.36.52.png](https://s3.amazonaws.com/moonup/production/uploads/1674290304538-5f91b1208a61a359f44e1851.png)