GermanT5
/

t5-efficient-gc4-german-base-nl36


Philip May committed on May 22, 2022

Commit

823f41f

·

1 Parent(s): 03e7c09

Update README.md

Files changed (1)

README.md +2 -1

README.md

@@ -22,4 +22,5 @@ For example, it does not work on V100 GPUs. On A100, however, it does.
 That is why we suggest to use [DeepSpeed](https://github.com/microsoft/DeepSpeed) for training.
 In particular, we recommend the [ZeRO-3 Example](https://huggingface.co/docs/transformers/main_classes/deepspeed#zero3-example) `auto` configuration.
-> ZeRO-Offload pushes the boundary of the maximum model size that can be trained efficiently using minimal GPU resources, by exploiting computational and memory resources on both GPUs and their host CPUs. see [ZeRO-Offload](https://www.deepspeed.ai/features/#zero-offload)
+> ZeRO-Offload pushes the boundary of the maximum model size that can be trained efficiently using minimal GPU resources, by exploiting computational and memory resources on both GPUs and their host CPUs.
+see [ZeRO-Offload](https://www.deepspeed.ai/features/#zero-offload)
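
For readers who want to try the recommendation in the README, here is a minimal sketch of a ZeRO-3 configuration with CPU offload, written as a Python dict and dumped to JSON. It is an illustration, not the configuration the authors used: the key names are assumed to follow the ZeRO-3 example in the Transformers DeepSpeed documentation, the file name `ds_config_zero3.json` and the helper `write_config` are hypothetical, and precision settings (fp16/bf16) are deliberately left out.

```python
import json

# Assumption: keys mirror the ZeRO-3 "auto" example from the Transformers
# DeepSpeed docs. Values set to "auto" are expected to be filled in by the
# Hugging Face Trainer from its own training arguments.
ds_config = {
    "zero_optimization": {
        "stage": 3,
        # ZeRO-Offload: keep optimizer state and parameters in host (CPU) memory.
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "offload_param": {"device": "cpu", "pin_memory": True},
        "overlap_comm": True,
        "contiguous_gradients": True,
        "reduce_bucket_size": "auto",
        "stage3_prefetch_bucket_size": "auto",
        "stage3_param_persistence_threshold": "auto",
    },
    "gradient_accumulation_steps": "auto",
    "gradient_clipping": "auto",
    "train_batch_size": "auto",
    "train_micro_batch_size_per_gpu": "auto",
}


def write_config(path="ds_config_zero3.json"):
    """Write the sketch config to a JSON file (hypothetical file name)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(ds_config, f, indent=2)
    return path
```

Assuming a training script built on the Hugging Face `Trainer`, the resulting file would typically be passed through the `deepspeed` training argument (for example `--deepspeed ds_config_zero3.json`); check the linked ZeRO-3 example for the full, current set of options.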