# Model Card for GPT-6B_Tuned_small_pile

GPT-6B_Tuned_small_pile is a GPT-J-6B model fine-tuned on 0.1 million (100,000) examples from the Pile dataset.

n_embd: 4096, n_layer: 28, n_positions: 2048
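
As a sanity check on the "6B" in the name, the parameter count can be roughly estimated from the dimensions above. This is a minimal sketch, not an exact count: it ignores biases and layer norms, and `vocab_size=50400` is an assumption taken from the base GPT-J configuration, not a value stated in this card. The helper name `approx_gptj_params` is hypothetical.

```python
def approx_gptj_params(n_embd: int, n_layer: int, vocab_size: int) -> int:
    """Rough parameter estimate for a GPT-J-style decoder (weights only)."""
    # Attention: four n_embd x n_embd projections (q, k, v, out) = 4 * n_embd^2.
    # MLP: two projections with inner width 4 * n_embd = 8 * n_embd^2.
    per_layer = 12 * n_embd ** 2
    # Input embedding plus an untied output head (assumed, as in base GPT-J).
    embeddings = 2 * vocab_size * n_embd
    return n_layer * per_layer + embeddings


# vocab_size is an assumption; n_embd and n_layer come from the card.
total = approx_gptj_params(n_embd=4096, n_layer=28, vocab_size=50400)
print(f"{total / 1e9:.2f}B parameters")
```

Under these assumptions the estimate lands at about 6.05 billion parameters, in line with the base model's nominal size.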

Tuning Parameters:

val_split_percent: 20

momentum: 0.9

train_batch_size (effective): 32

train_micro_batch: 16

gradient_accumulation_steps: 2

gradient_clipping: 0.5

learning_rate: 0.00001

weight_decay: 0.01

lr_scheduler: cosine

lr_warmup_steps: 1000

lr_decay: 0.1

lr_decay_step: 2000

mixed_precision: bf16
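
To make the batch-size and learning-rate settings above concrete, here is a minimal sketch of how they might fit together. Several values are assumptions, not facts from this card: `NUM_DEVICES = 1` is inferred only because 16 × 2 already equals the effective batch size of 32, and `total_steps=2500` assumes a single epoch over 80,000 training examples (100,000 examples minus the 20% validation split) at batch size 32. The card does not say how `lr_decay` and `lr_decay_step` interact with the cosine schedule, so they are not modeled here. The function name `lr_at` is hypothetical.

```python
import math

TRAIN_MICRO_BATCH = 16
GRAD_ACCUM_STEPS = 2
NUM_DEVICES = 1  # assumption: not stated in the card

# Effective batch size = micro batch * accumulation steps * data-parallel devices.
effective_batch = TRAIN_MICRO_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES


def lr_at(step: int,
          base_lr: float = 1e-5,
          warmup_steps: int = 1000,
          total_steps: int = 2500) -> float:
    """Linear warmup followed by cosine decay to zero (illustrative only)."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup period.
        return base_lr * step / warmup_steps
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    # Cosine factor goes from 1 (start of decay) to 0 (end of training).
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 500, 1000, 1750, 2500):
    print(step, lr_at(step))
```

With these assumed values, warmup would take up 1000 of the 2500 optimizer steps, so the learning rate spends a large share of the run below its peak of 0.00001.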

## Model Details

### Model Description

Developed by: [More Information Needed]
Shared by [optional]: [More Information Needed]
Model type: [More Information Needed]
Language(s) (NLP): [More Information Needed]
License: [More Information Needed]
Finetuned from model: EleutherAI/gpt-j-6b
