stas committed on
Commit
220ffd1
1 Parent(s): dad29eb

10h run README

  1. README.md +12 -0
README.md ADDED
@@ -0,0 +1,12 @@
+ CodeCarbon wasn't ready until the training was over, so we did only an additional 10h run to take measurements, which we then extrapolate to the whole training.
+
+ This captures the startup time and 2499 iterations in 2 records, since an intermediate checkpoint was also saved halfway through and we flush the CodeCarbon
+ records on each checkpoint save.
+
+ The training had 168000 iterations. Therefore multiply the reported data by 67 (168000 / 2499 ≈ 67). This is only a rough approximation, since we used 16 nodes during
+ the ramp-up, then 64, and 128 nodes only for the last 3 weeks.
+
+ Each csv file contains a report for a single GPU.
+
+
+
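The extrapolation described in the README can be sketched as follows. This is a minimal illustration, not the authors' script: the column name `energy_consumed`, the helper names, and the choice to treat the two records in each file as additive per-interval values are all assumptions. If the records turn out to be cumulative, pass `cumulative=True` so that only the last record of each file is used. Check the header of the actual csv files before relying on any column name.

```python
import csv

MEASURED_ITERATIONS = 2499   # iterations covered by the 10h run (from the README)
TOTAL_ITERATIONS = 168000    # iterations in the full training (from the README)


def scale_factor(total=TOTAL_ITERATIONS, measured=MEASURED_ITERATIONS):
    """Ratio of full-training iterations to measured iterations, rounded."""
    return round(total / measured)


def sum_column(csv_paths, column, cumulative=False):
    """Sum one numeric column over all per-GPU csv files.

    cumulative=False: every record is an independent interval (assumption).
    cumulative=True:  each record is a running total, so only the last
                      record of each file is counted.
    """
    total = 0.0
    for path in csv_paths:
        with open(path, newline="") as f:
            values = [float(row[column]) for row in csv.DictReader(f)]
        if not values:
            continue  # skip files with a header but no records
        total += values[-1] if cumulative else sum(values)
    return total


def extrapolate(csv_paths, column="energy_consumed", cumulative=False):
    """Scale the summed 10h measurement up to the whole training."""
    return sum_column(csv_paths, column, cumulative) * scale_factor()
```

For example, `extrapolate(sorted(glob.glob("*.csv")))` (after `import glob`, with a hypothetical directory layout) would sum the assumed column over every per-GPU file and multiply by 67. The result inherits the caveat stated in the README: the node count changed during training (16, then 64, then 128), so a single multiplier gives only a rough estimate.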