===================================================================================================================
Layer (type:depth-idx)                                            Output Shape              Param #
===================================================================================================================
MegaForMaskedLM                                                   [4, 2048, 50265]          --
├─MegaModel: 1-1                                                  [4, 2048, 768]            --
│    └─MegaEmbeddings: 2-1                                        [4, 2048, 768]            --
│    │    └─Embedding: 3-1                                        [4, 2048, 768]            38,603,520
│    └─ModuleList: 2-2                                            --                        --
│    │    └─MegaBlock: 3-2                                        [2048, 4, 768]            6,202,626
│    │    └─MegaBlock: 3-3                                        [2048, 4, 768]            6,202,626
│    │    └─MegaBlock: 3-4                                        [2048, 4, 768]            6,202,626
│    │    └─MegaBlock: 3-5                                        [2048, 4, 768]            6,202,626
│    │    └─MegaBlock: 3-6                                        [2048, 4, 768]            6,202,626
│    │    └─MegaBlock: 3-7                                        [2048, 4, 768]            6,202,626
│    │    └─MegaBlock: 3-8                                        [2048, 4, 768]            6,202,626
│    │    └─MegaBlock: 3-9                                        [2048, 4, 768]            6,202,626
│    │    └─MegaBlock: 3-10                                       [2048, 4, 768]            6,202,626
│    │    └─MegaBlock: 3-11                                       [2048, 4, 768]            6,202,626
│    │    └─MegaBlock: 3-12                                       [2048, 4, 768]            6,202,626
│    │    └─MegaBlock: 3-13                                       [2048, 4, 768]            6,202,626
├─Linear: 1-2                                                     [4, 2048, 50265]          38,653,785
===================================================================================================================
Total params: 151,688,817
Trainable params: 151,688,817
Non-trainable params: 0
Total mult-adds (G): 150.35
===================================================================================================================
Input size (MB): 0.07
Forward/backward pass size (MB): 10818.75
Params size (MB): 606.71
Estimated Total Size (MB): 11425.52
===================================================================================================================
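
The totals in the summary can be cross-checked against the per-layer rows. The sketch below copies the counts from the table and sums them. The variable names are illustrative, and the two shape-based checks rest on assumptions inferred from the output shapes: that the embedding is a plain vocabulary-by-hidden lookup table, and that the final Linear layer carries a bias vector.

```python
# Per-layer parameter counts copied from the summary table above.
embedding_params = 38_603_520
mega_block_params = 6_202_626
num_mega_blocks = 12  # rows 3-2 through 3-13
lm_head_params = 38_653_785

# Vocabulary and hidden sizes read off the output shapes in the table.
vocab_size, hidden_size = 50_265, 768

# Assumption: the embedding is a plain vocab_size x hidden_size lookup table.
assert embedding_params == vocab_size * hidden_size
# Assumption: the final Linear layer has a weight matrix plus a bias vector.
assert lm_head_params == hidden_size * vocab_size + vocab_size

total_params = embedding_params + num_mega_blocks * mega_block_params + lm_head_params
print(f"{total_params:,}")  # 151,688,817
```

The sum matches the reported "Total params" line, so the twelve MegaBlock layers account for roughly half of the model's parameters, with the embedding and the output projection making up the rest.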