|
CUDA extension not installed. |
|
Downloading (…)lve/main/config.json: 100%|██████████| 662/662 [00:00<00:00, 1.65MB/s] |
|
Downloading pytorch_model.bin: 100%|██████████████| 3.13G/3.13G [00:36<00:00, 86.9MB/s] |
|
Some weights of the model checkpoint at google/flan-t5-large were not used when initializing T5EncoderModel: ['decoder.block.4.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.20.layer.1.EncDecAttention.k.weight', 'decoder.block.2.layer.0.SelfAttention.k.weight', 'decoder.block.13.layer.0.SelfAttention.k.weight', 'decoder.block.20.layer.0.SelfAttention.o.weight', 'decoder.block.1.layer.0.SelfAttention.o.weight', 'decoder.block.7.layer.0.SelfAttention.v.weight', 'decoder.block.8.layer.2.layer_norm.weight', 'decoder.embed_tokens.weight', 'decoder.block.23.layer.0.SelfAttention.k.weight', 'decoder.block.17.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.1.layer.1.EncDecAttention.q.weight', 'decoder.block.21.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.2.layer.0.layer_norm.weight', 'decoder.block.21.layer.1.EncDecAttention.k.weight', 'decoder.block.0.layer.0.SelfAttention.relative_attention_bias.weight', 'decoder.block.18.layer.1.EncDecAttention.k.weight', 'decoder.block.9.layer.1.EncDecAttention.k.weight', 'decoder.block.13.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.12.layer.0.layer_norm.weight', 'decoder.block.23.layer.2.DenseReluDense.wo.weight', 'decoder.block.21.layer.1.EncDecAttention.v.weight', 'decoder.block.18.layer.0.SelfAttention.k.weight', 'decoder.block.15.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.20.layer.1.EncDecAttention.v.weight', 'decoder.block.8.layer.1.layer_norm.weight', 'decoder.block.10.layer.1.layer_norm.weight', 'decoder.block.12.layer.1.EncDecAttention.q.weight', 'decoder.block.9.layer.0.layer_norm.weight', 'decoder.block.0.layer.0.layer_norm.weight', 'decoder.block.14.layer.1.EncDecAttention.o.weight', 'decoder.block.3.layer.2.layer_norm.weight', 'decoder.block.23.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.15.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.8.layer.0.SelfAttention.q.weight', 'decoder.block.21.layer.0.SelfAttention.v.weight', 'decoder.block.3.layer.1.layer_norm.weight', 
'decoder.block.9.layer.1.EncDecAttention.v.weight', 'decoder.block.12.layer.1.EncDecAttention.o.weight', 'decoder.block.23.layer.0.SelfAttention.q.weight', 'decoder.block.2.layer.0.SelfAttention.v.weight', 'decoder.block.13.layer.0.SelfAttention.o.weight', 'decoder.block.5.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.6.layer.1.EncDecAttention.q.weight', 'decoder.block.3.layer.0.SelfAttention.k.weight', 'decoder.block.2.layer.2.layer_norm.weight', 'decoder.block.1.layer.0.SelfAttention.k.weight', 'decoder.block.15.layer.0.SelfAttention.o.weight', 'decoder.block.1.layer.0.SelfAttention.q.weight', 'decoder.block.4.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.5.layer.0.SelfAttention.k.weight', 'decoder.block.4.layer.1.layer_norm.weight', 'decoder.block.10.layer.1.EncDecAttention.q.weight', 'decoder.block.1.layer.0.layer_norm.weight', 'decoder.block.11.layer.0.SelfAttention.v.weight', 'decoder.block.23.layer.1.EncDecAttention.v.weight', 'decoder.block.8.layer.1.EncDecAttention.o.weight', 'decoder.block.3.layer.0.SelfAttention.o.weight', 'decoder.block.9.layer.2.DenseReluDense.wo.weight', 'decoder.block.16.layer.1.EncDecAttention.v.weight', 'decoder.block.18.layer.0.layer_norm.weight', 'decoder.block.11.layer.1.layer_norm.weight', 'decoder.block.22.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.20.layer.1.layer_norm.weight', 'decoder.block.11.layer.1.EncDecAttention.v.weight', 'decoder.block.9.layer.0.SelfAttention.o.weight', 'decoder.block.3.layer.0.SelfAttention.q.weight', 'decoder.block.11.layer.0.layer_norm.weight', 'decoder.block.7.layer.0.layer_norm.weight', 'decoder.block.13.layer.0.SelfAttention.v.weight', 'decoder.block.21.layer.2.layer_norm.weight', 'decoder.block.20.layer.1.EncDecAttention.o.weight', 'decoder.block.16.layer.0.SelfAttention.q.weight', 'decoder.block.16.layer.0.SelfAttention.v.weight', 'decoder.block.17.layer.2.DenseReluDense.wo.weight', 'decoder.block.6.layer.2.DenseReluDense.wi_0.weight', 
'decoder.block.22.layer.2.layer_norm.weight', 'decoder.block.19.layer.2.layer_norm.weight', 'decoder.block.8.layer.0.SelfAttention.k.weight', 'decoder.block.10.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.0.layer.2.DenseReluDense.wo.weight', 'decoder.block.13.layer.0.SelfAttention.q.weight', 'decoder.block.17.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.23.layer.0.layer_norm.weight', 'decoder.block.19.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.5.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.5.layer.0.SelfAttention.v.weight', 'decoder.block.1.layer.0.SelfAttention.v.weight', 'decoder.block.15.layer.1.EncDecAttention.k.weight', 'decoder.block.23.layer.0.SelfAttention.o.weight', 'decoder.block.3.layer.2.DenseReluDense.wo.weight', 'decoder.block.17.layer.1.EncDecAttention.v.weight', 'decoder.block.7.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.9.layer.0.SelfAttention.q.weight', 'decoder.block.0.layer.2.layer_norm.weight', 'decoder.block.7.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.19.layer.1.EncDecAttention.o.weight', 'decoder.block.11.layer.1.EncDecAttention.q.weight', 'decoder.block.3.layer.0.SelfAttention.v.weight', 'decoder.block.18.layer.2.DenseReluDense.wo.weight', 'decoder.block.11.layer.0.SelfAttention.k.weight', 'decoder.block.6.layer.0.SelfAttention.q.weight', 'decoder.block.9.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.0.layer.1.EncDecAttention.o.weight', 'decoder.block.7.layer.1.layer_norm.weight', 'decoder.block.22.layer.0.SelfAttention.v.weight', 'decoder.block.15.layer.1.layer_norm.weight', 'decoder.block.20.layer.2.DenseReluDense.wo.weight', 'decoder.block.14.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.18.layer.1.EncDecAttention.q.weight', 'decoder.block.23.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.8.layer.1.EncDecAttention.v.weight', 'decoder.block.6.layer.1.EncDecAttention.v.weight', 'decoder.block.18.layer.1.EncDecAttention.v.weight', 
'decoder.block.10.layer.1.EncDecAttention.o.weight', 'decoder.block.21.layer.2.DenseReluDense.wo.weight', 'decoder.block.21.layer.0.layer_norm.weight', 'decoder.block.22.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.5.layer.0.layer_norm.weight', 'decoder.block.23.layer.1.EncDecAttention.o.weight', 'decoder.block.17.layer.0.SelfAttention.q.weight', 'decoder.block.22.layer.1.EncDecAttention.k.weight', 'decoder.block.4.layer.0.layer_norm.weight', 'decoder.block.0.layer.1.layer_norm.weight', 'decoder.block.1.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.5.layer.1.EncDecAttention.k.weight', 'decoder.block.17.layer.1.EncDecAttention.k.weight', 'decoder.block.13.layer.1.EncDecAttention.k.weight', 'decoder.block.19.layer.0.SelfAttention.k.weight', 'decoder.block.7.layer.1.EncDecAttention.k.weight', 'decoder.block.7.layer.0.SelfAttention.o.weight', 'decoder.block.5.layer.2.DenseReluDense.wo.weight', 'decoder.block.7.layer.0.SelfAttention.k.weight', 'decoder.block.12.layer.1.layer_norm.weight', 'decoder.block.11.layer.0.SelfAttention.q.weight', 'decoder.block.20.layer.0.SelfAttention.v.weight', 'decoder.block.12.layer.2.layer_norm.weight', 'decoder.block.17.layer.1.EncDecAttention.q.weight', 'decoder.block.8.layer.0.layer_norm.weight', 'decoder.block.11.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.13.layer.2.DenseReluDense.wo.weight', 'decoder.block.21.layer.0.SelfAttention.k.weight', 'decoder.block.23.layer.0.SelfAttention.v.weight', 'decoder.block.20.layer.0.SelfAttention.k.weight', 'decoder.block.22.layer.0.SelfAttention.k.weight', 'decoder.block.14.layer.1.EncDecAttention.q.weight', 'decoder.block.15.layer.0.SelfAttention.q.weight', 'decoder.block.21.layer.1.EncDecAttention.q.weight', 'decoder.block.21.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.16.layer.1.EncDecAttention.k.weight', 'decoder.block.18.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.1.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.0.layer.2.DenseReluDense.wi_1.weight', 
'decoder.block.6.layer.0.SelfAttention.k.weight', 'decoder.block.15.layer.1.EncDecAttention.v.weight', 'decoder.block.17.layer.1.EncDecAttention.o.weight', 'decoder.block.20.layer.0.SelfAttention.q.weight', 'decoder.block.4.layer.1.EncDecAttention.k.weight', 'decoder.block.17.layer.0.SelfAttention.k.weight', 'decoder.block.0.layer.1.EncDecAttention.k.weight', 'decoder.block.19.layer.1.EncDecAttention.q.weight', 'decoder.block.12.layer.1.EncDecAttention.k.weight', 'decoder.block.16.layer.1.EncDecAttention.q.weight', 'decoder.block.4.layer.1.EncDecAttention.v.weight', 'decoder.block.22.layer.1.EncDecAttention.o.weight', 'decoder.block.5.layer.1.EncDecAttention.q.weight', 'lm_head.weight', 'decoder.block.17.layer.2.layer_norm.weight', 'decoder.block.18.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.8.layer.1.EncDecAttention.q.weight', 'decoder.block.10.layer.1.EncDecAttention.k.weight', 'decoder.block.1.layer.2.layer_norm.weight', 'decoder.block.19.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.17.layer.0.SelfAttention.o.weight', 'decoder.block.14.layer.1.EncDecAttention.k.weight', 'decoder.block.21.layer.0.SelfAttention.q.weight', 'decoder.block.10.layer.0.SelfAttention.k.weight', 'decoder.block.22.layer.0.layer_norm.weight', 'decoder.block.21.layer.1.EncDecAttention.o.weight', 'decoder.block.1.layer.2.DenseReluDense.wo.weight', 'decoder.block.14.layer.0.SelfAttention.v.weight', 'decoder.block.22.layer.1.EncDecAttention.q.weight', 'decoder.block.7.layer.2.layer_norm.weight', 'decoder.block.9.layer.0.SelfAttention.k.weight', 'decoder.block.4.layer.0.SelfAttention.o.weight', 'decoder.block.5.layer.1.layer_norm.weight', 'decoder.block.23.layer.2.layer_norm.weight', 'decoder.block.17.layer.0.layer_norm.weight', 'decoder.block.14.layer.0.SelfAttention.o.weight', 'decoder.block.14.layer.2.layer_norm.weight', 'decoder.block.5.layer.2.layer_norm.weight', 'decoder.block.4.layer.0.SelfAttention.k.weight', 'decoder.block.0.layer.2.DenseReluDense.wi_0.weight', 
'decoder.block.9.layer.1.layer_norm.weight', 'decoder.block.20.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.14.layer.2.DenseReluDense.wo.weight', 'decoder.block.7.layer.1.EncDecAttention.v.weight', 'decoder.block.16.layer.1.layer_norm.weight', 'decoder.block.2.layer.0.SelfAttention.q.weight', 'decoder.block.19.layer.0.SelfAttention.v.weight', 'decoder.block.6.layer.0.SelfAttention.v.weight', 'decoder.block.7.layer.1.EncDecAttention.o.weight', 'decoder.block.5.layer.0.SelfAttention.q.weight', 'decoder.block.15.layer.0.SelfAttention.k.weight', 'decoder.block.19.layer.1.EncDecAttention.k.weight', 'decoder.block.6.layer.1.layer_norm.weight', 'decoder.block.22.layer.1.EncDecAttention.v.weight', 'decoder.block.5.layer.0.SelfAttention.o.weight', 'decoder.block.14.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.16.layer.2.layer_norm.weight', 'decoder.block.4.layer.1.EncDecAttention.o.weight', 'decoder.block.3.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.15.layer.0.layer_norm.weight', 'decoder.block.16.layer.0.SelfAttention.k.weight', 'decoder.block.23.layer.1.layer_norm.weight', 'decoder.block.8.layer.0.SelfAttention.v.weight', 'decoder.block.2.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.12.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.2.layer.1.layer_norm.weight', 'decoder.block.13.layer.1.EncDecAttention.v.weight', 'decoder.block.9.layer.0.SelfAttention.v.weight', 'decoder.block.3.layer.1.EncDecAttention.v.weight', 'decoder.block.20.layer.0.layer_norm.weight', 'decoder.block.13.layer.2.layer_norm.weight', 'decoder.block.16.layer.2.DenseReluDense.wo.weight', 'decoder.block.14.layer.1.EncDecAttention.v.weight', 'decoder.block.6.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.6.layer.2.layer_norm.weight', 'decoder.block.21.layer.0.SelfAttention.o.weight', 'decoder.block.8.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.4.layer.2.DenseReluDense.wo.weight', 'decoder.block.12.layer.0.SelfAttention.o.weight', 
'decoder.block.6.layer.0.SelfAttention.o.weight', 'decoder.block.11.layer.2.layer_norm.weight', 'decoder.block.12.layer.1.EncDecAttention.v.weight', 'decoder.block.22.layer.0.SelfAttention.q.weight', 'decoder.block.19.layer.0.SelfAttention.q.weight', 'decoder.block.16.layer.1.EncDecAttention.o.weight', 'decoder.block.1.layer.1.layer_norm.weight', 'decoder.block.17.layer.0.SelfAttention.v.weight', 'decoder.block.6.layer.2.DenseReluDense.wo.weight', 'decoder.block.10.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.18.layer.0.SelfAttention.o.weight', 'decoder.block.19.layer.1.EncDecAttention.v.weight', 'decoder.block.14.layer.0.layer_norm.weight', 'decoder.block.12.layer.0.SelfAttention.v.weight', 'decoder.block.7.layer.2.DenseReluDense.wo.weight', 'decoder.block.2.layer.1.EncDecAttention.o.weight', 'decoder.block.10.layer.0.layer_norm.weight', 'decoder.block.9.layer.1.EncDecAttention.q.weight', 'decoder.block.12.layer.0.SelfAttention.k.weight', 'decoder.block.10.layer.0.SelfAttention.v.weight', 'decoder.block.8.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.0.layer.0.SelfAttention.o.weight', 'decoder.block.3.layer.1.EncDecAttention.o.weight', 'decoder.block.11.layer.1.EncDecAttention.k.weight', 'decoder.block.18.layer.0.SelfAttention.q.weight', 'decoder.block.4.layer.2.layer_norm.weight', 'decoder.block.19.layer.2.DenseReluDense.wo.weight', 'decoder.block.3.layer.1.EncDecAttention.q.weight', 'decoder.block.22.layer.2.DenseReluDense.wo.weight', 'decoder.block.14.layer.0.SelfAttention.q.weight', 'decoder.block.13.layer.1.layer_norm.weight', 'decoder.block.6.layer.0.layer_norm.weight', 'decoder.block.4.layer.0.SelfAttention.q.weight', 'decoder.block.19.layer.0.layer_norm.weight', 'decoder.block.3.layer.0.layer_norm.weight', 'decoder.block.2.layer.1.EncDecAttention.v.weight', 'decoder.block.23.layer.1.EncDecAttention.k.weight', 'decoder.block.20.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.2.layer.1.EncDecAttention.q.weight', 
'decoder.block.10.layer.1.EncDecAttention.v.weight', 'decoder.block.16.layer.0.layer_norm.weight', 'decoder.block.18.layer.0.SelfAttention.v.weight', 'decoder.block.12.layer.0.SelfAttention.q.weight', 'decoder.block.2.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.5.layer.1.EncDecAttention.o.weight', 'decoder.block.3.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.13.layer.1.EncDecAttention.o.weight', 'decoder.block.8.layer.1.EncDecAttention.k.weight', 'decoder.block.2.layer.0.SelfAttention.o.weight', 'decoder.block.2.layer.2.DenseReluDense.wo.weight', 'decoder.block.9.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.15.layer.2.DenseReluDense.wo.weight', 'decoder.block.4.layer.1.EncDecAttention.q.weight', 'decoder.block.7.layer.0.SelfAttention.q.weight', 'decoder.block.13.layer.1.EncDecAttention.q.weight', 'decoder.block.5.layer.1.EncDecAttention.v.weight', 'decoder.block.17.layer.1.layer_norm.weight', 'decoder.block.16.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.11.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.15.layer.1.EncDecAttention.o.weight', 'decoder.block.10.layer.2.DenseReluDense.wo.weight', 'decoder.block.13.layer.2.DenseReluDense.wi_0.weight', 'decoder.block.0.layer.0.SelfAttention.q.weight', 'decoder.block.14.layer.1.layer_norm.weight', 'decoder.block.19.layer.0.SelfAttention.o.weight', 'decoder.block.13.layer.0.layer_norm.weight', 'decoder.block.6.layer.1.EncDecAttention.o.weight', 'decoder.block.8.layer.0.SelfAttention.o.weight', 'decoder.block.22.layer.1.layer_norm.weight', 'decoder.block.8.layer.2.DenseReluDense.wo.weight', 'decoder.block.19.layer.1.layer_norm.weight', 'decoder.block.21.layer.1.layer_norm.weight', 'decoder.block.0.layer.0.SelfAttention.v.weight', 'decoder.block.0.layer.0.SelfAttention.k.weight', 'decoder.block.16.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.2.layer.1.EncDecAttention.k.weight', 'decoder.block.18.layer.1.layer_norm.weight', 'decoder.block.1.layer.1.EncDecAttention.k.weight', 
'decoder.block.11.layer.2.DenseReluDense.wo.weight', 'decoder.block.18.layer.2.layer_norm.weight', 'decoder.block.16.layer.0.SelfAttention.o.weight', 'decoder.block.12.layer.2.DenseReluDense.wo.weight', 'decoder.block.11.layer.0.SelfAttention.o.weight', 'decoder.block.9.layer.2.layer_norm.weight', 'decoder.block.18.layer.1.EncDecAttention.o.weight', 'decoder.block.9.layer.1.EncDecAttention.o.weight', 'decoder.block.20.layer.1.EncDecAttention.q.weight', 'decoder.block.4.layer.0.SelfAttention.v.weight', 'decoder.block.7.layer.1.EncDecAttention.q.weight', 'decoder.block.1.layer.1.EncDecAttention.v.weight', 'decoder.block.1.layer.1.EncDecAttention.o.weight', 'decoder.block.0.layer.1.EncDecAttention.q.weight', 'decoder.block.15.layer.0.SelfAttention.v.weight', 'decoder.block.10.layer.0.SelfAttention.o.weight', 'decoder.block.15.layer.2.layer_norm.weight', 'decoder.block.0.layer.1.EncDecAttention.v.weight', 'decoder.block.14.layer.0.SelfAttention.k.weight', 'decoder.block.22.layer.0.SelfAttention.o.weight', 'decoder.block.20.layer.2.layer_norm.weight', 'decoder.block.10.layer.2.layer_norm.weight', 'decoder.block.6.layer.1.EncDecAttention.k.weight', 'decoder.block.10.layer.0.SelfAttention.q.weight', 'decoder.final_layer_norm.weight', 'decoder.block.15.layer.1.EncDecAttention.q.weight', 'decoder.block.3.layer.1.EncDecAttention.k.weight', 'decoder.block.12.layer.2.DenseReluDense.wi_1.weight', 'decoder.block.23.layer.1.EncDecAttention.q.weight', 'decoder.block.11.layer.1.EncDecAttention.o.weight'] |
|
- This IS expected if you are initializing T5EncoderModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model). |
|
- This IS NOT expected if you are initializing T5EncoderModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model). |
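
The unused-weights message above is easier to judge once the key names are grouped by their top-level prefix. The keys in this log appear to belong only to the decoder and the language-model head, which is what one would expect when an encoder-only class is loaded from a full encoder-decoder checkpoint. A minimal sketch of that check, using a hypothetical helper `count_top_level_prefixes` and a handful of keys copied from the message:

```python
from collections import Counter


def count_top_level_prefixes(keys):
    """Count parameter names by the part before the first dot."""
    return Counter(key.split(".", 1)[0] for key in keys)


# A few of the keys reported as unused in the log (not the full list).
sample_keys = [
    "decoder.block.4.layer.2.DenseReluDense.wi_0.weight",
    "decoder.block.20.layer.1.EncDecAttention.k.weight",
    "decoder.embed_tokens.weight",
    "decoder.final_layer_norm.weight",
    "lm_head.weight",
]

prefix_counts = count_top_level_prefixes(sample_keys)
print(dict(prefix_counts))
```

If an `encoder` prefix showed up in such a count, the warning would deserve a closer look; with only `decoder` and `lm_head` present it matches the "IS expected" case.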
|
Found cached dataset wikitext (/root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126) |
|
Found cached dataset wikitext (/root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126) |
|
Downloading (…)okenizer_config.json: 100%|██| 2.54k/2.54k [00:00<00:00, 9.09MB/s] |
|
Downloading spiece.model: 100%|████████████████████████████| 792k/792k [00:00<00:00, 28.7MB/s] |
|
Downloading (…)cial_tokens_map.json: 100%|██| 2.20k/2.20k [00:00<00:00, 7.83MB/s] |
|
Token indices sequence length is longer than the specified maximum sequence length for this model (2837981 > 512). Running this sequence through the model will result in indexing errors |
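
The warning above is raised because the whole corpus was tokenized as one sequence of 2,837,981 tokens, far beyond the 512-token limit. It is harmless as long as the long sequence is cut into windows of at most 512 tokens before anything is passed to the model. The log does not show how this script does that, so the following is only a sketch of one common approach (random fixed-length calibration windows); `sample_windows`, `seqlen`, `nsamples` and `seed` are illustrative names, not taken from the script.

```python
import random


def sample_windows(token_ids, seqlen, nsamples, seed=0):
    """Draw `nsamples` random contiguous windows of `seqlen` tokens."""
    if len(token_ids) < seqlen:
        raise ValueError("token sequence is shorter than one window")
    rng = random.Random(seed)  # local RNG so results are reproducible
    windows = []
    for _ in range(nsamples):
        # randint is inclusive on both ends, so the last valid start is len - seqlen
        start = rng.randint(0, len(token_ids) - seqlen)
        windows.append(token_ids[start:start + seqlen])
    return windows


# Stand-in for the tokenized corpus: plain integers instead of real token ids.
fake_token_ids = list(range(10_000))
calibration = sample_windows(fake_token_ids, seqlen=512, nsamples=128)
```

Each window then fits the model's maximum sequence length, so the indexing errors mentioned in the warning should not occur.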
|
Starting ... |
|
Ready. |
|
0 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.55 |
|
error 142.37025451660156 |
|
0 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.25 |
|
error 9521.5029296875 |
|
0 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.26 |
|
error 2544.900390625 |
|
0 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.28 |
|
error 123186.2578125 |
|
0 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.25 |
|
error 11158.978515625 |
|
0 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.25 |
|
error 9518.11328125 |
|
0 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.72 |
|
error 3637286.0 |
|
1 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.39 |
|
error 536.7674560546875 |
|
1 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.25 |
|
error 25588.546875 |
|
1 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.25 |
|
error 1919.272216796875 |
|
1 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.25 |
|
error 47080.5625 |
|
1 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.25 |
|
error 9808.359375 |
|
1 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.25 |
|
error 6298.18896484375 |
|
1 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.71 |
|
error 137391.875 |
|
2 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.41 |
|
error 125.06156921386719 |
|
2 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.25 |
|
error 6493.82568359375 |
|
2 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.25 |
|
error 1306.6259765625 |
|
2 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.25 |
|
error 3543.05029296875 |
|
2 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.25 |
|
error 10326.599609375 |
|
2 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.25 |
|
error 8165.3193359375 |
|
2 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.69 |
|
error 105276.7265625 |
|
3 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.41 |
|
error 137.07083129882812 |
|
3 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.25 |
|
error 7485.19384765625 |
|
3 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.25 |
|
error 1563.48095703125 |
|
3 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.27 |
|
error 3057.40673828125 |
|
3 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.25 |
|
error 10634.482421875 |
|
3 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.27 |
|
error 9444.2841796875 |
|
3 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.73 |
|
error 105683.125 |
|
4 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.41 |
|
error 133.7151336669922 |
|
4 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.27 |
|
error 7297.93896484375 |
|
4 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.25 |
|
error 1610.62939453125 |
|
4 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.25 |
|
error 7214.41796875 |
|
4 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.25 |
|
error 14451.642578125 |
|
4 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.25 |
|
error 15960.328125 |
|
4 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.69 |
|
error 4980679168.0 |
|
5 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.39 |
|
error 140.4214324951172 |
|
5 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.25 |
|
error 7479.8193359375 |
|
5 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.25 |
|
error 2484.518310546875 |
|
5 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.25 |
|
error 8618.46484375 |
|
5 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.27 |
|
error 10754.0419921875 |
|
5 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.25 |
|
error 13012.9423828125 |
|
5 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.69 |
|
error 107111.1875 |
|
6 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.40 |
|
error 112.6629867553711 |
|
6 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.25 |
|
error 7047.806640625 |
|
6 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.25 |
|
error 2059.9892578125 |
|
6 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.25 |
|
error 5445.0029296875 |
|
6 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.26 |
|
error 11107.181640625 |
|
6 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.25 |
|
error 15983.3603515625 |
|
6 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.70 |
|
error 685753216.0 |
|
7 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.41 |
|
error 133.351806640625 |
|
7 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.25 |
|
error 8262.615234375 |
|
7 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.26 |
|
error 2878.16943359375 |
|
7 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.27 |
|
error 17972.373046875 |
|
7 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.25 |
|
error 11895.857421875 |
|
7 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.25 |
|
error 18337.82421875 |
|
7 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.72 |
|
error 25902379008.0 |
|
8 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.39 |
|
error 120.18170928955078 |
|
8 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.25 |
|
error 7699.7255859375 |
|
8 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.25 |
|
error 2972.5712890625 |
|
8 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.25 |
|
error 8750.123046875 |
|
8 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.25 |
|
error 11126.8662109375 |
|
8 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.25 |
|
error 18306.9609375 |
|
8 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.71 |
|
error 128990.28125 |
|
9 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.39 |
|
error 126.16083526611328 |
|
9 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.25 |
|
error 8584.9208984375 |
|
9 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.26 |
|
error 3245.54541015625 |
|
9 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.25 |
|
error 15868.41015625 |
|
9 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.25 |
|
error 9290.447265625 |
|
9 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.25 |
|
error 17894.17578125 |
|
9 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.71 |
|
error 149863.296875 |
|
10 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.39 |
|
error 107.48172760009766 |
|
10 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.27 |
|
error 6898.35595703125 |
|
10 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.26 |
|
error 3770.64990234375 |
|
10 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.25 |
|
error 17137.037109375 |
|
10 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.27 |
|
error 8128.5166015625 |
|
10 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.26 |
|
error 17371.587890625 |
|
10 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.73 |
|
error 116027.1015625 |
|
11 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.40 |
|
error 104.61625671386719 |
|
11 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.25 |
|
error 7259.4208984375 |
|
11 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.25 |
|
error 5005.52490234375 |
|
11 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.27 |
|
error 32728.1015625 |
|
11 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.25 |
|
error 8535.056640625 |
|
11 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.27 |
|
error 22538.978515625 |
|
11 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.71 |
|
error 170254.40625 |
|
12 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.39 |
|
error 94.82140350341797 |
|
12 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.25 |
|
error 6448.5205078125 |
|
12 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.25 |
|
error 5083.41796875 |
|
12 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.25 |
|
error 60036.953125 |
|
12 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.26 |
|
error 7829.4384765625 |
|
12 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.26 |
|
error 23411.65234375 |
|
12 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.69 |
|
error 231657.15625 |
|
13 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.39 |
|
error 90.77069091796875 |
|
13 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.25 |
|
error 5828.037109375 |
|
13 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.26 |
|
error 4888.35302734375 |
|
13 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.25 |
|
error 41515.46484375 |
|
13 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.25 |
|
error 7063.1728515625 |
|
13 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.25 |
|
error 23648.7421875 |
|
13 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.69 |
|
error 261193.75 |
|
14 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.39 |
|
error 77.24964904785156 |
|
14 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.27 |
|
error 5096.2626953125 |
|
14 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.26 |
|
error 6915.9384765625 |
|
14 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.26 |
|
error 56402.62890625 |
|
14 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.28 |
|
error 6039.11328125 |
|
14 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.25 |
|
error 24090.625 |
|
14 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.71 |
|
error 355204.3125 |
|
15 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.39 |
|
error 72.92942810058594 |
|
15 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.25 |
|
error 5561.1201171875 |
|
15 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.25 |
|
error 8621.376953125 |
|
15 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.25 |
|
error 146386.5625 |
|
15 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.25 |
|
error 5684.064453125 |
|
15 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.25 |
|
error 26869.12109375 |
|
15 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.70 |
|
error 361036.25 |
|
16 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.39 |
|
error 75.83228302001953 |
|
16 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.25 |
|
error 5176.50341796875 |
|
16 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.25 |
|
error 9754.8203125 |
|
16 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.25 |
|
error 231755.03125 |
|
16 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.27 |
|
error 5699.75390625 |
|
16 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.25 |
|
error 25039.771484375 |
|
16 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.69 |
|
error 651520.75 |
|
17 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.39 |
|
error 61.858299255371094 |
|
17 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.25 |
|
error 4369.08251953125 |
|
17 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.25 |
|
error 12425.16796875 |
|
17 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.25 |
|
error 408129.875 |
|
17 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.25 |
|
error 5317.8798828125 |
|
17 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.25 |
|
error 26979.31640625 |
|
17 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.73 |
|
error 689154.875 |
|
18 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.41 |
|
error 68.12550354003906 |
|
18 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.27 |
|
error 4010.4833984375 |
|
18 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.26 |
|
error 14657.2314453125 |
|
18 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.25 |
|
error 206627.5 |
|
18 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.28 |
|
error 6068.525390625 |
|
18 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.25 |
|
error 28093.669921875 |
|
18 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.72 |
|
error 1019951.8125 |
|
19 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.41 |
|
error 57.68662643432617 |
|
19 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.25 |
|
error 4086.83349609375 |
|
19 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.25 |
|
error 14453.2578125 |
|
19 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.25 |
|
error 460674.0 |
|
19 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.25 |
|
error 5235.9794921875 |
|
19 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.26 |
|
error 28788.4765625 |
|
19 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.70 |
|
error 1332541.0 |
|
20 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.39 |
|
error 42.9056510925293 |
|
20 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.25 |
|
error 2894.2177734375 |
|
20 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.25 |
|
error 16684.044921875 |
|
20 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.25 |
|
error 557086.6875 |
|
20 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.25 |
|
error 6791.15625 |
|
20 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.25 |
|
error 38994.37890625 |
|
20 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.69 |
|
error 2295082.0 |
|
21 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.41 |
|
error 58.024559020996094 |
|
21 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.25 |
|
error 3534.38427734375 |
|
21 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.28 |
|
error 23622.609375 |
|
21 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.26 |
|
error 630538.75 |
|
21 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.27 |
|
error 6944.4306640625 |
|
21 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.25 |
|
error 41437.5546875 |
|
21 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.72 |
|
error 2805766.25 |
|
22 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.39 |
|
error 56.98418426513672 |
|
22 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.27 |
|
error 2588.40576171875 |
|
22 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.26 |
|
error 33727.3125 |
|
22 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.26 |
|
error 1536184.5 |
|
22 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.28 |
|
error 7638.18701171875 |
|
22 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.25 |
|
error 49872.0859375 |
|
22 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.69 |
|
error 4077312.5 |
|
23 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.40 |
|
error 53.174556732177734 |
|
23 layer.0.SelfAttention.k |
|
Quantizing ... |
|
time 0.26 |
|
error 2663.560302734375 |
|
23 layer.0.SelfAttention.v |
|
Quantizing ... |
|
time 0.27 |
|
error 35553.75 |
|
23 layer.0.SelfAttention.o |
|
Quantizing ... |
|
time 0.26 |
|
error 1983365.75 |
|
23 layer.1.DenseReluDense.wi_0 |
|
Quantizing ... |
|
time 0.25 |
|
error 8208.654296875 |
|
23 layer.1.DenseReluDense.wi_1 |
|
Quantizing ... |
|
time 0.25 |
|
error 51633.640625 |
|
23 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.69 |
|
error 8843078.0 |
|
114.8298749923706 |
|
Packing ... |
|
encoder.block.0.layer.0.SelfAttention.q |
|
encoder.block.0.layer.0.SelfAttention.k |
|
encoder.block.0.layer.0.SelfAttention.v |
|
encoder.block.0.layer.0.SelfAttention.o |
|
encoder.block.0.layer.1.DenseReluDense.wi_0 |
|
encoder.block.0.layer.1.DenseReluDense.wi_1 |
|
encoder.block.0.layer.1.DenseReluDense.wo |
|
encoder.block.1.layer.0.SelfAttention.q |
|
encoder.block.1.layer.0.SelfAttention.k |
|
encoder.block.1.layer.0.SelfAttention.v |
|
encoder.block.1.layer.0.SelfAttention.o |
|
encoder.block.1.layer.1.DenseReluDense.wi_0 |
|
encoder.block.1.layer.1.DenseReluDense.wi_1 |
|
encoder.block.1.layer.1.DenseReluDense.wo |
|
encoder.block.2.layer.0.SelfAttention.q |
|
encoder.block.2.layer.0.SelfAttention.k |
|
encoder.block.2.layer.0.SelfAttention.v |
|
encoder.block.2.layer.0.SelfAttention.o |
|
encoder.block.2.layer.1.DenseReluDense.wi_0 |
|
encoder.block.2.layer.1.DenseReluDense.wi_1 |
|
encoder.block.2.layer.1.DenseReluDense.wo |
|
encoder.block.3.layer.0.SelfAttention.q |
|
encoder.block.3.layer.0.SelfAttention.k |
|
encoder.block.3.layer.0.SelfAttention.v |
|
encoder.block.3.layer.0.SelfAttention.o |
|
encoder.block.3.layer.1.DenseReluDense.wi_0 |
|
encoder.block.3.layer.1.DenseReluDense.wi_1 |
|
encoder.block.3.layer.1.DenseReluDense.wo |
|
encoder.block.4.layer.0.SelfAttention.q |
|
encoder.block.4.layer.0.SelfAttention.k |
|
encoder.block.4.layer.0.SelfAttention.v |
|
encoder.block.4.layer.0.SelfAttention.o |
|
encoder.block.4.layer.1.DenseReluDense.wi_0 |
|
encoder.block.4.layer.1.DenseReluDense.wi_1 |
|
encoder.block.4.layer.1.DenseReluDense.wo |
|
encoder.block.5.layer.0.SelfAttention.q |
|
encoder.block.5.layer.0.SelfAttention.k |
|
encoder.block.5.layer.0.SelfAttention.v |
|
encoder.block.5.layer.0.SelfAttention.o |
|
encoder.block.5.layer.1.DenseReluDense.wi_0 |
|
encoder.block.5.layer.1.DenseReluDense.wi_1 |
|
encoder.block.5.layer.1.DenseReluDense.wo |
|
encoder.block.6.layer.0.SelfAttention.q |
|
encoder.block.6.layer.0.SelfAttention.k |
|
encoder.block.6.layer.0.SelfAttention.v |
|
encoder.block.6.layer.0.SelfAttention.o |
|
encoder.block.6.layer.1.DenseReluDense.wi_0 |
|
encoder.block.6.layer.1.DenseReluDense.wi_1 |
|
encoder.block.6.layer.1.DenseReluDense.wo |
|
encoder.block.7.layer.0.SelfAttention.q |
|
encoder.block.7.layer.0.SelfAttention.k |
|
encoder.block.7.layer.0.SelfAttention.v |
|
encoder.block.7.layer.0.SelfAttention.o |
|
encoder.block.7.layer.1.DenseReluDense.wi_0 |
|
encoder.block.7.layer.1.DenseReluDense.wi_1 |
|
encoder.block.7.layer.1.DenseReluDense.wo |
|
encoder.block.8.layer.0.SelfAttention.q |
|
encoder.block.8.layer.0.SelfAttention.k |
|
encoder.block.8.layer.0.SelfAttention.v |
|
encoder.block.8.layer.0.SelfAttention.o |
|
encoder.block.8.layer.1.DenseReluDense.wi_0 |
|
encoder.block.8.layer.1.DenseReluDense.wi_1 |
|
encoder.block.8.layer.1.DenseReluDense.wo |
|
encoder.block.9.layer.0.SelfAttention.q |
|
encoder.block.9.layer.0.SelfAttention.k |
|
encoder.block.9.layer.0.SelfAttention.v |
|
encoder.block.9.layer.0.SelfAttention.o |
|
encoder.block.9.layer.1.DenseReluDense.wi_0 |
|
encoder.block.9.layer.1.DenseReluDense.wi_1 |
|
encoder.block.9.layer.1.DenseReluDense.wo |
|
encoder.block.10.layer.0.SelfAttention.q |
|
encoder.block.10.layer.0.SelfAttention.k |
|
encoder.block.10.layer.0.SelfAttention.v |
|
encoder.block.10.layer.0.SelfAttention.o |
|
encoder.block.10.layer.1.DenseReluDense.wi_0 |
|
encoder.block.10.layer.1.DenseReluDense.wi_1 |
|
encoder.block.10.layer.1.DenseReluDense.wo |
|
encoder.block.11.layer.0.SelfAttention.q |
|
encoder.block.11.layer.0.SelfAttention.k |
|
encoder.block.11.layer.0.SelfAttention.v |
|
encoder.block.11.layer.0.SelfAttention.o |
|
encoder.block.11.layer.1.DenseReluDense.wi_0 |
|
encoder.block.11.layer.1.DenseReluDense.wi_1 |
|
encoder.block.11.layer.1.DenseReluDense.wo |
|
encoder.block.12.layer.0.SelfAttention.q |
|
encoder.block.12.layer.0.SelfAttention.k |
|
encoder.block.12.layer.0.SelfAttention.v |
|
encoder.block.12.layer.0.SelfAttention.o |
|
encoder.block.12.layer.1.DenseReluDense.wi_0 |
|
encoder.block.12.layer.1.DenseReluDense.wi_1 |
|
encoder.block.12.layer.1.DenseReluDense.wo |
|
encoder.block.13.layer.0.SelfAttention.q |
|
encoder.block.13.layer.0.SelfAttention.k |
|
encoder.block.13.layer.0.SelfAttention.v |
|
encoder.block.13.layer.0.SelfAttention.o |
|
encoder.block.13.layer.1.DenseReluDense.wi_0 |
|
encoder.block.13.layer.1.DenseReluDense.wi_1 |
|
encoder.block.13.layer.1.DenseReluDense.wo |
|
encoder.block.14.layer.0.SelfAttention.q |
|
encoder.block.14.layer.0.SelfAttention.k |
|
encoder.block.14.layer.0.SelfAttention.v |
|
encoder.block.14.layer.0.SelfAttention.o |
|
encoder.block.14.layer.1.DenseReluDense.wi_0 |
|
encoder.block.14.layer.1.DenseReluDense.wi_1 |
|
encoder.block.14.layer.1.DenseReluDense.wo |
|
encoder.block.15.layer.0.SelfAttention.q |
|
encoder.block.15.layer.0.SelfAttention.k |
|
encoder.block.15.layer.0.SelfAttention.v |
|
encoder.block.15.layer.0.SelfAttention.o |
|
encoder.block.15.layer.1.DenseReluDense.wi_0 |
|
encoder.block.15.layer.1.DenseReluDense.wi_1 |
|
encoder.block.15.layer.1.DenseReluDense.wo |
|
encoder.block.16.layer.0.SelfAttention.q |
|
encoder.block.16.layer.0.SelfAttention.k |
|
encoder.block.16.layer.0.SelfAttention.v |
|
encoder.block.16.layer.0.SelfAttention.o |
|
encoder.block.16.layer.1.DenseReluDense.wi_0 |
|
encoder.block.16.layer.1.DenseReluDense.wi_1 |
|
encoder.block.16.layer.1.DenseReluDense.wo |
|
encoder.block.17.layer.0.SelfAttention.q |
|
encoder.block.17.layer.0.SelfAttention.k |
|
encoder.block.17.layer.0.SelfAttention.v |
|
encoder.block.17.layer.0.SelfAttention.o |
|
encoder.block.17.layer.1.DenseReluDense.wi_0 |
|
encoder.block.17.layer.1.DenseReluDense.wi_1 |
|
encoder.block.17.layer.1.DenseReluDense.wo |
|
encoder.block.18.layer.0.SelfAttention.q |
|
encoder.block.18.layer.0.SelfAttention.k |
|
encoder.block.18.layer.0.SelfAttention.v |
|
encoder.block.18.layer.0.SelfAttention.o |
|
encoder.block.18.layer.1.DenseReluDense.wi_0 |
|
encoder.block.18.layer.1.DenseReluDense.wi_1 |
|
encoder.block.18.layer.1.DenseReluDense.wo |
|
encoder.block.19.layer.0.SelfAttention.q |
|
encoder.block.19.layer.0.SelfAttention.k |
|
encoder.block.19.layer.0.SelfAttention.v |
|
encoder.block.19.layer.0.SelfAttention.o |
|
encoder.block.19.layer.1.DenseReluDense.wi_0 |
|
encoder.block.19.layer.1.DenseReluDense.wi_1 |
|
encoder.block.19.layer.1.DenseReluDense.wo |
|
encoder.block.20.layer.0.SelfAttention.q |
|
encoder.block.20.layer.0.SelfAttention.k |
|
encoder.block.20.layer.0.SelfAttention.v |
|
encoder.block.20.layer.0.SelfAttention.o |
|
encoder.block.20.layer.1.DenseReluDense.wi_0 |
|
encoder.block.20.layer.1.DenseReluDense.wi_1 |
|
encoder.block.20.layer.1.DenseReluDense.wo |
|
encoder.block.21.layer.0.SelfAttention.q |
|
encoder.block.21.layer.0.SelfAttention.k |
|
encoder.block.21.layer.0.SelfAttention.v |
|
encoder.block.21.layer.0.SelfAttention.o |
|
encoder.block.21.layer.1.DenseReluDense.wi_0 |
|
encoder.block.21.layer.1.DenseReluDense.wi_1 |
|
encoder.block.21.layer.1.DenseReluDense.wo |
|
encoder.block.22.layer.0.SelfAttention.q |
|
encoder.block.22.layer.0.SelfAttention.k |
|
encoder.block.22.layer.0.SelfAttention.v |
|
encoder.block.22.layer.0.SelfAttention.o |
|
encoder.block.22.layer.1.DenseReluDense.wi_0 |
|
encoder.block.22.layer.1.DenseReluDense.wi_1 |
|
encoder.block.22.layer.1.DenseReluDense.wo |
|
encoder.block.23.layer.0.SelfAttention.q |
|
encoder.block.23.layer.0.SelfAttention.k |
|
encoder.block.23.layer.0.SelfAttention.v |
|
encoder.block.23.layer.0.SelfAttention.o |
|
encoder.block.23.layer.1.DenseReluDense.wi_0 |
|
encoder.block.23.layer.1.DenseReluDense.wi_1 |
|
encoder.block.23.layer.1.DenseReluDense.wo |
|
Done. |
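
The per-module records above (a `<block> <module>` line, then `time`, then `error`) can be collected into a table for inspection, for example to confirm that the `DenseReluDense.wo` modules carry the largest reported error in each block. The following is a minimal sketch, assuming the line layout visible in this log, including the trailing `|` separators. The function name `parse_quant_log` and the record keys are hypothetical and not part of the quantization tool.

```python
import re

# Matches header lines such as "18 layer.1.DenseReluDense.wo"
LAYER_RE = re.compile(r"^(\d+) (layer\.\S+)$")

def parse_quant_log(text):
    """Collect one record per quantized module from the log text.

    Assumes each module appears as a header line followed by
    'time <seconds>' and 'error <value>' lines, as in the log above.
    """
    records = []
    current = None
    for raw in text.splitlines():
        # Drop the trailing '|' cell separator and surrounding whitespace.
        line = raw.strip().rstrip("|").strip()
        if not line:
            continue
        m = LAYER_RE.match(line)
        if m:
            current = {"block": int(m.group(1)), "module": m.group(2)}
            continue
        if current is None:
            continue  # e.g. 'Packing ...' or other unrelated lines
        key, _, value = line.partition(" ")
        if key == "time":
            current["time"] = float(value)
        elif key == "error":
            current["error"] = float(value)
            records.append(current)
            current = None
    return records

# Two records copied from the log above.
sample = """\
18 layer.0.SelfAttention.q |
|
Quantizing ... |
|
time 0.41 |
|
error 68.12550354003906 |
|
18 layer.1.DenseReluDense.wo |
|
Quantizing ... |
|
time 0.72 |
|
error 1019951.8125 |
"""

records = parse_quant_log(sample)
worst = max(records, key=lambda r: r["error"])
print(worst["block"], worst["module"], worst["error"])
```

Sorting the full list of records by `error` is one quick way to see which modules might merit a closer look; the sketch makes no claim about how the tool computes the reported error.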