---
language: en
tags:
- pytorch
- gpt2
- language-model
pipeline_tag: text-generation
---

# GPT-X Model

This is a GPT-2-style base language model for text generation, trained using the GPT-X framework.

## Model Architecture

- Layers: 12
- Attention Heads: 12
- Hidden Size: 768
- Vocabulary Size: 50257
- Maximum Sequence Length: 1024
- Model Type: base
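
The hyperparameters above are enough for a rough parameter-count estimate. The sketch below assumes the standard GPT-2 layout: learned positional embeddings, biases on all linear layers, a 4x MLP expansion, and an output head tied to the token embedding. None of these are confirmed by this card, and the GPT-X framework may differ. `ModelConfig` and `estimate_gpt2_params` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Values copied from the Model Architecture list; field names are illustrative."""
    n_layer: int = 12
    n_head: int = 12
    n_embd: int = 768
    vocab_size: int = 50257
    block_size: int = 1024  # maximum sequence length


def estimate_gpt2_params(cfg: ModelConfig) -> int:
    """Count parameters under an ASSUMED standard GPT-2 layout (tied output head)."""
    d = cfg.n_embd
    token_emb = cfg.vocab_size * d
    pos_emb = cfg.block_size * d
    layer_norm = 2 * d                      # scale + shift
    attn_qkv = d * 3 * d + 3 * d            # fused query/key/value projection
    attn_out = d * d + d                    # attention output projection
    mlp_up = d * 4 * d + 4 * d              # assumed 4x expansion
    mlp_down = 4 * d * d + d
    per_block = 2 * layer_norm + attn_qkv + attn_out + mlp_up + mlp_down
    final_norm = layer_norm
    return token_emb + pos_emb + cfg.n_layer * per_block + final_norm


cfg = ModelConfig()
head_dim = cfg.n_embd // cfg.n_head  # per-head width
print(f"head_dim={head_dim}, params={estimate_gpt2_params(cfg):,}")
```

Under these assumptions the estimate comes to roughly 124 million parameters, which matches the size usually quoted for the smallest GPT-2 configuration. If the framework drops biases or unties the output head, the total changes accordingly.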

## Training Details

- Batch Size: 524288
- Learning Rate: 0.0006
- Weight Decay: 0.0
- Mixed Precision: True
- Optimizer: muon
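
The card does not state the unit of the batch size. If it is counted in tokens (an assumption, though 524288 = 2^19 is a common token budget per optimizer step), it can be related to the maximum sequence length of 1024 as sketched below. The helper names, the micro-batch size, and the device count are hypothetical and do not come from the GPT-X framework.

```python
def sequences_per_step(batch_tokens: int, seq_len: int) -> int:
    """Number of full-length sequences in one optimizer step."""
    if batch_tokens % seq_len != 0:
        raise ValueError("batch_tokens must be a multiple of seq_len")
    return batch_tokens // seq_len


def grad_accum_steps(batch_tokens: int, seq_len: int,
                     micro_batch_size: int, world_size: int = 1) -> int:
    """Gradient-accumulation steps needed to reach batch_tokens.

    micro_batch_size is sequences per device per forward pass;
    world_size is the number of data-parallel devices.
    """
    tokens_per_micro_step = micro_batch_size * seq_len * world_size
    if batch_tokens % tokens_per_micro_step != 0:
        raise ValueError("batch_tokens must be divisible by tokens per micro-step")
    return batch_tokens // tokens_per_micro_step


BATCH_TOKENS = 524288  # "Batch Size" from the card, ASSUMED to be in tokens
SEQ_LEN = 1024         # "Maximum Sequence Length" from the card

print(sequences_per_step(BATCH_TOKENS, SEQ_LEN))
# Hypothetical hardware setup: 8 devices, 16 sequences per device per pass.
print(grad_accum_steps(BATCH_TOKENS, SEQ_LEN, micro_batch_size=16, world_size=8))
```

With these illustrative settings, one optimizer step covers 512 sequences and needs 4 accumulation steps. If the batch size is instead counted in sequences, this arithmetic does not apply.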