deveshreddy27 committed
Commit ebd28bb
1 Parent(s): c1a8f5c

Update README.md

Files changed (1)
  1. README.md +20 -1
README.md CHANGED
@@ -60,7 +60,7 @@ Users (both direct and downstream) should be made aware of the risks, biases and
 
  <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
  - Training regime: Mixed precision training using bf16
- - Number of epochs: 18
+ - Number of epochs: 27
  - Learning rate: 1e-6
  - Batch size: 16
  - Seq length: 512
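As an aside, the hyperparameters in the hunk above map onto a standard Hugging Face fine-tuning configuration. The sketch below is a minimal illustration only, not the card author's script: it uses `transformers.Trainer` with a stand-in `gpt2` checkpoint and a placeholder dataset, and on Gaudi 2 hardware the `GaudiTrainer`/`GaudiTrainingArguments` classes from optimum-habana would typically be used in place of the plain Trainer.

```python
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "gpt2"      # stand-in checkpoint, not the model this card describes
MAX_SEQ_LENGTH = 512     # "Seq length: 512"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

args = TrainingArguments(
    output_dir="finetune-out",
    num_train_epochs=27,              # "Number of epochs: 27" (updated from 18 in this commit)
    learning_rate=1e-6,               # "Learning rate: 1e-6"
    per_device_train_batch_size=16,   # "Batch size: 16"
    bf16=True,                        # bf16 mixed-precision training (needs supporting hardware)
)

def tokenize(batch):
    # Truncate/pad every example to the 512-token sequence length.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=MAX_SEQ_LENGTH)

# train_dataset = raw_dataset.map(tokenize, batched=True)   # placeholder dataset
# trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
# trainer.train()
```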
@@ -75,6 +75,25 @@ Users (both direct and downstream) should be made aware of the risks, biases and
  - Intel Gaudi 2 AI Accelerator
  - Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz
 
+
+ #### Hardware utilization
+ ##### Training
+ max_memory_allocated (GB) 94.62
+ memory_allocated (GB) 67.67
+ total_memory_available (GB) 94.62
+ train_loss 1.321901714310941
+ train_runtime 9741.6819
+ train_samples_per_second 15.877
+ train_steps_per_second 0.995
+
+ ##### Inference
+ Throughput (including tokenization) = 102.3085449650079 tokens/second
+ Number of HPU graphs = 18
+ Memory allocated = 15.37 GB
+ Max memory allocated = 15.39 GB
+ Total memory available = 94.62 GB
+ Graph compilation duration = 9.98630401911214 seconds
+
  #### Software
  - Pytorch
  - Transformers library
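The added metrics appear to be the standard figures reported by an optimum-habana run on Gaudi. As a rough sanity check derived from the numbers above (not stated in the card): train_samples_per_second / train_steps_per_second ≈ 15.877 / 0.995 ≈ 16, matching the batch size, and train_runtime × train_samples_per_second ≈ 154.7k samples over 27 epochs, i.e. roughly 5.7k training samples per epoch. The sketch below shows one way to measure generation throughput "including tokenization" in the same spirit as the inference figure; it is a device-agnostic illustration with a stand-in checkpoint and prompt, not the script that produced the reported number.

```python
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"                           # stand-in checkpoint, not the card's model
prompt = "The Intel Gaudi 2 accelerator"      # example prompt, not from the card

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

start = time.perf_counter()
inputs = tokenizer(prompt, return_tensors="pt")      # tokenization is inside the timed region
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64, do_sample=False)
elapsed = time.perf_counter() - start

# Throughput "including tokenization": newly generated tokens / wall-clock time
new_tokens = output.shape[1] - inputs["input_ids"].shape[1]
print(f"Throughput (including tokenization) = {new_tokens / elapsed:.2f} tokens/second")
```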