Here is a sample truncated output for such a configuration:

                      *** Starting batch number=1 ***
    abs min  abs max  metadata
                      shared Embedding
    1.01e-06 7.92e+02 weight
    0.00e+00 2.47e+04 input[0]
    5.36e-05 7.92e+02 output
    [...]
                      decoder.dropout Dropout
    1.60e-07 2.27e+01 input[0]
    0.00e+00 2.52e+01 output
                      decoder T5Stack
         not a tensor output
                      lm_head Linear
    1.01e-06 7.92e+02 weight
    0.00e+00 1.11e+00 input[0]
    6.06e-02 8.39e+01 output
                      T5ForConditionalGeneration
         not a tensor output

                      *** Starting batch number=3 ***
    abs min  abs max  metadata
                      shared Embedding
    1.01e-06 7.92e+02 weight
    0.00e+00 2.78e+04 input[0]
    5.36e-05 7.92e+02 output
    [...]

Here you will get a huge number of frames dumped, as many as there were forward calls in your model, so it may or may not be what you want, but sometimes it can be easier to use for debugging purposes than a normal debugger.
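
Because the dump can run to a very large number of frames, it may help to filter it after the fact rather than read it top to bottom. The following is a minimal sketch, not part of the library: `rows_above`, `SAMPLE`, and the threshold value are hypothetical names chosen for illustration, and the parser assumes each data row has the `abs min  abs max  metadata` layout shown above, with every frame header printed before its rows.

```python
import re

# One number as printed in the dump, e.g. "7.92e+02" (also tolerate inf/nan).
NUM = r"(?:\d\.\d+e[+-]\d+|inf|nan)"
# A data row such as "1.01e-06 7.92e+02 weight".
ROW = re.compile(rf"^({NUM})\s+({NUM})\s+(\S.*)$")
# A batch header such as "*** Starting batch number=1 ***".
BATCH = re.compile(r"\*\*\* Starting batch number=(\d+) \*\*\*")


def rows_above(dump: str, threshold: float):
    """Return (batch, frame, metadata, abs_max) for rows whose abs max exceeds threshold.

    Hypothetical helper: it only understands the line layout assumed above.
    """
    batch = None   # batch number from the most recent batch header
    frame = None   # most recent frame header, e.g. "shared Embedding"
    hits = []
    for raw in dump.splitlines():
        line = raw.strip()
        # Skip blanks, the column header, truncation markers and non-tensor notes.
        if (not line or line == "[...]" or line.startswith("abs min")
                or line.startswith("not a tensor")):
            continue
        header = BATCH.search(line)
        if header:
            batch = int(header.group(1))
            continue
        row = ROW.match(line)
        if row:
            abs_max = float(row.group(2))
            if abs_max > threshold:
                hits.append((batch, frame, row.group(3), abs_max))
            continue
        # Anything else is treated as a frame header (module name and class).
        frame = line
    return hits


# A few lines copied from the sample output above.
SAMPLE = """\
*** Starting batch number=1 ***
abs min  abs max  metadata
shared Embedding
1.01e-06 7.92e+02 weight
0.00e+00 2.47e+04 input[0]
5.36e-05 7.92e+02 output
lm_head Linear
1.01e-06 7.92e+02 weight
0.00e+00 1.11e+00 input[0]
6.06e-02 8.39e+01 output
"""

for hit in rows_above(SAMPLE, threshold=1e4):
    print(hit)
```

The threshold here is arbitrary; in practice you would pick one that matches the numeric range of the dtype you are debugging, and decide per row whether a large value is actually a problem.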