Regarding Choosing TPU over GPU for training

#23
by eshamanideep - opened

Why did you choose TPU v4 over A100 80 GB or H100 GPUs? Was there a specific reason, such as being faster or more economical? I'd like to know so I can decide between GPU and TPU for fine-tuning this model. Thanks!

We chose TPUs because we don't have good enough inter-node bandwidth on GCP for distributed training on GPUs.
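For anyone weighing the same choice for fine-tuning, here is a rough back-of-envelope sketch of why inter-node bandwidth matters in data-parallel training. It assumes a ring all-reduce, where each node sends about 2(N-1)/N times the gradient size per step, and it ignores latency and compute overlap. The function name, model size, node count, and link speeds are all hypothetical placeholders, not measurements from this model's training setup.

```python
def ring_allreduce_seconds(param_count, bytes_per_param, num_nodes, bandwidth_gbit_per_s):
    """Bandwidth-only estimate of one ring all-reduce over the gradients.

    Ignores latency, compute/communication overlap, and gradient compression.
    """
    if num_nodes < 2:
        return 0.0  # a single node has nothing to synchronize
    payload_bytes = param_count * bytes_per_param
    # In a ring all-reduce each node sends 2 * (N - 1) / N of the payload.
    bytes_on_wire = 2 * (num_nodes - 1) / num_nodes * payload_bytes
    bandwidth_bytes_per_s = bandwidth_gbit_per_s * 1e9 / 8
    return bytes_on_wire / bandwidth_bytes_per_s


# Hypothetical inputs: 7e9 parameters, 2-byte gradients, 8 nodes.
for gbit in (100, 400, 1600):
    t = ring_allreduce_seconds(7e9, 2, 8, gbit)
    print(f"{gbit:>5} Gbit/s per node -> ~{t:.2f} s of gradient sync per step")
```

If the sync time per step is comparable to or larger than the compute time per step, the accelerators sit idle waiting on the network, which is the kind of bottleneck described above. Plug in your own model size and the link speed your provider actually offers before deciding.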
