---
language: en
tags:
- pytorch
- gpt2
- language-model
pipeline_tag: text-generation
---
# GPT-X Model
This model was trained using the GPT-X framework.
## Model Architecture
- Layers: 12
- Attention Heads: 12
- Hidden Size: 768
- Vocabulary Size: 50257
- Maximum Sequence Length: 1024
- Model Type: base
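
As a rough sanity check on the figures above, the parameter count can be estimated from the listed dimensions. This is a minimal sketch that assumes the standard GPT-2 block layout (fused QKV projection, 4x MLP expansion, biases on all linear layers, two LayerNorms per block, and an output head tied to the token embedding). The GPT-X framework may differ on any of these points, and the function name `gpt2_param_count` is hypothetical. The number of attention heads does not change the total, since the heads only partition the hidden dimension.

```python
def gpt2_param_count(n_layer, n_embd, vocab_size, n_ctx, tied_embeddings=True):
    """Estimate parameters for a GPT-2 style decoder (assumed layout, not confirmed by this card)."""
    tok_emb = vocab_size * n_embd   # token embedding table
    pos_emb = n_ctx * n_embd        # learned position embeddings
    # Per block: fused QKV projection plus attention output projection (weights + biases)
    attn = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # Per block: MLP with an assumed 4x expansion (weights + biases)
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    # Per block: two LayerNorms, each with a scale and a shift vector
    norms = 2 * 2 * n_embd
    blocks = n_layer * (attn + mlp + norms)
    final_norm = 2 * n_embd
    # A tied output head reuses the token embedding and adds no parameters
    lm_head = 0 if tied_embeddings else vocab_size * n_embd
    return tok_emb + pos_emb + blocks + final_norm + lm_head


total = gpt2_param_count(n_layer=12, n_embd=768, vocab_size=50257, n_ctx=1024)
print(f"{total:,}")  # 124,439,808
```

Under these assumptions the configuration comes to about 124M parameters, which matches the size usually quoted for the smallest GPT-2 variant.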
## Training Details
- Batch Size: 524288
- Learning Rate: 0.0006
- Weight Decay: 0.0
- Mixed Precision: True
- Optimizer: Muon
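
The card does not state the unit of the batch size. One plausible reading, offered here only as an assumption, is that 524288 counts tokens per optimizer step, since 524288 = 512 × 1024 divides evenly by the maximum sequence length. The sketch below shows how such a token budget would split into sequences and gradient-accumulation steps. The `micro_batch_size` and `world_size` values are hypothetical and do not come from this card.

```python
def batch_layout(tokens_per_step, seq_len, micro_batch_size, world_size):
    """Split a per-step token budget into sequences and gradient-accumulation steps."""
    if tokens_per_step % seq_len != 0:
        raise ValueError("token budget is not a multiple of the sequence length")
    sequences = tokens_per_step // seq_len
    # Sequences processed in one forward/backward pass across all devices
    per_pass = micro_batch_size * world_size
    if sequences % per_pass != 0:
        raise ValueError("sequences per step do not divide evenly across passes")
    return sequences, sequences // per_pass


# Assumption: batch size is measured in tokens; 16 sequences per device on 8 devices is illustrative only.
sequences, accum_steps = batch_layout(524288, 1024, micro_batch_size=16, world_size=8)
print(sequences, accum_steps)  # 512 4
```

If the batch size is instead measured in sequences, this arithmetic does not apply.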