Model Card for GPT-6B_Tuned_small_pile
GPT-6B_Tuned_small_pile is a GPT-J-6B model fine-tuned on 0.1 million (100k) examples from the Pile dataset.
Architecture (inherited from GPT-J-6B): n_embd: 4096, n_layer: 28, n_positions: 2048
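These figures match the base model's published configuration. A minimal sketch for inspecting them, assuming the `transformers` library:

```python
from transformers import AutoConfig

# Pull the base model's config; the attributes below correspond to the
# architecture figures quoted above.
config = AutoConfig.from_pretrained("EleutherAI/gpt-j-6b")
print(config.n_embd)       # 4096 (hidden size)
print(config.n_layer)      # 28 (transformer blocks)
print(config.n_positions)  # 2048 (maximum context length)
```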
Tuning Parameters:
val_split_percent: 20
momentum: 0.9
train_batch_size (effective): 32 (train_micro_batch × gradient_accumulation_steps)
train_micro_batch: 16
gradient_accumulation_steps: 2
gradient_clipping: 0.5
learning_rate: 0.00001
weight_decay: 0.01
lr_scheduler: cosine
lr_warmup_steps: 1000
lr_decay: 0.1
lr_decay_step: 2000
mixed_precision: bf16
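The original training script is not published; the following is a sketch of how these settings could map onto Hugging Face `TrainingArguments`. The framework choice and output path are assumptions, and momentum, val_split_percent, lr_decay, and lr_decay_step have no direct `TrainingArguments` equivalent (they would live in the optimizer and dataset-split setup).

```python
from transformers import TrainingArguments

# A sketch mapping the card's hyperparameters onto TrainingArguments;
# not the author's actual training configuration.
args = TrainingArguments(
    output_dir="gpt-6b-tuned-small-pile",  # hypothetical output path
    per_device_train_batch_size=16,        # train_micro_batch
    gradient_accumulation_steps=2,         # 16 x 2 = effective batch size 32
    max_grad_norm=0.5,                     # gradient_clipping
    learning_rate=1e-5,                    # 0.00001
    weight_decay=0.01,
    lr_scheduler_type="cosine",            # lr_scheduler
    warmup_steps=1000,                     # lr_warmup_steps
    bf16=True,                             # mixed_precision
)
```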
Model Description
- Developed by: [More Information Needed]
- Shared by [optional]: [More Information Needed]
- Model type: Causal (decoder-only) transformer language model, GPT-J architecture
- Language(s) (NLP): Primarily English (the Pile)
- License: [More Information Needed]
- Finetuned from model: EleutherAI/gpt-j-6b
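A minimal generation sketch, assuming the checkpoint is hosted on the Hub; the repo id below is a placeholder, substitute the actual path of this checkpoint:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "GPT-6B_Tuned_small_pile"  # placeholder; use the actual Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("The Pile is a large, diverse", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```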