Model Card for GPT-6B_Tuned_small_pile

GPT-6B_Tuned_small_pile is a GPT-J-6B model fine-tuned on 0.1 million (100,000) examples from the Pile dataset.

n_embd: 4096, n_layer: 28, n_positions: 2048
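As a rough sanity check on the "6B" in the name, the parameter count can be estimated from the dimensions above. This is a back-of-the-envelope sketch, not an exact count: it ignores biases and layer norms, and it assumes a 4x MLP expansion, untied input/output embeddings, and a vocabulary size of 50400 (the GPT-J default, not stated in this card).

```python
# Architecture values copied from the card.
N_EMBD = 4096
N_LAYER = 28

# Assumption: vocabulary size of the base GPT-J tokenizer/config (not in the card).
VOCAB_SIZE = 50400


def estimate_parameters(n_embd: int, n_layer: int, vocab_size: int) -> int:
    """Approximate weight count, ignoring biases and layer norms."""
    # Attention: four n_embd x n_embd projections (query, key, value, output).
    attention = 4 * n_embd * n_embd
    # MLP: two projections between n_embd and 4 * n_embd (assumed 4x expansion).
    mlp = 2 * n_embd * (4 * n_embd)
    blocks = n_layer * (attention + mlp)
    # Input embedding plus an untied output projection (assumed).
    embeddings = 2 * vocab_size * n_embd
    return blocks + embeddings


total = estimate_parameters(N_EMBD, N_LAYER, VOCAB_SIZE)
print(f"approx. {total / 1e9:.2f}B parameters")
```

The estimate lands at roughly six billion, consistent with the model name.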

Tuning Parameters:

val_split_percent: 20

momentum: 0.9 

train_batch_size (effective): 32

train_micro_batch: 16

gradient_accumulation_steps: 2

gradient_clipping: 0.5

learning_rate: 0.00001

weight_decay: 0.01

lr_scheduler: cosine

lr_warmup_steps: 1000

lr_decay: 0.1

lr_decay_step: 2000

mixed_precision: bf16
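The sketch below shows how some of these values relate to each other. It is illustrative only and rests on assumptions not stated in the card: a single data-parallel worker, a validation split taken by example count, linear warmup, and a cosine curve that decays to zero over one epoch (`TOTAL_STEPS` is hypothetical). `lr_decay` and `lr_decay_step` are left out because the card does not say how they interact with the cosine schedule.

```python
import math

# Values copied from the list above.
NUM_EXAMPLES = 100_000
VAL_SPLIT_PERCENT = 20
TRAIN_MICRO_BATCH = 16
GRADIENT_ACCUMULATION_STEPS = 2
LEARNING_RATE = 1e-5
LR_WARMUP_STEPS = 1000


def effective_batch_size(micro_batch: int, accumulation_steps: int, num_workers: int = 1) -> int:
    """Examples consumed per optimizer step (num_workers=1 is an assumption)."""
    return micro_batch * accumulation_steps * num_workers


# Training examples left after holding out the validation split.
train_examples = NUM_EXAMPLES * (100 - VAL_SPLIT_PERCENT) // 100
steps_per_epoch = train_examples // effective_batch_size(
    TRAIN_MICRO_BATCH, GRADIENT_ACCUMULATION_STEPS
)

# Assumption: the schedule spans exactly one epoch; the card does not say.
TOTAL_STEPS = steps_per_epoch


def lr_at(step: int, peak_lr: float = LEARNING_RATE,
          warmup_steps: int = LR_WARMUP_STEPS, total_steps: int = TOTAL_STEPS) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero (assumed shape)."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 500, 1000, 1750, TOTAL_STEPS):
    print(step, lr_at(step))
```

Under these assumptions one epoch is 2,500 optimizer steps, so the 1,000-step warmup would cover 40% of an epoch.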

## Model Details

Model Description

  • Developed by: [More Information Needed]
  • Shared by [optional]: [More Information Needed]
  • Model type: [More Information Needed]
  • Language(s) (NLP): [More Information Needed]
  • License: [More Information Needed]
  • Finetuned from model: EleutherAI/gpt-j-6b
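A minimal loading sketch with the Hugging Face `transformers` library follows. It assumes the repository `Defalt-404/GPT-6B_Tuned_small_pile` hosts weights in a format `AutoModelForCausalLM` can read, and it loads the tokenizer from the base model `EleutherAI/gpt-j-6b` on the assumption that fine-tuning left the vocabulary unchanged. The helper names are hypothetical. Loading a 6B-parameter model in bf16 needs roughly 12 GB of memory for the weights alone.

```python
MODEL_ID = "Defalt-404/GPT-6B_Tuned_small_pile"
# Assumption: the fine-tuned model reuses the base model's tokenizer.
BASE_TOKENIZER_ID = "EleutherAI/gpt-j-6b"


def load_model_and_tokenizer(model_id: str = MODEL_ID, tokenizer_id: str = BASE_TOKENIZER_ID):
    """Download and load the model in bf16 along with the base tokenizer."""
    # Imported lazily so the constants above can be used without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    return model, tokenizer


def generate(prompt: str, max_new_tokens: int = 50) -> str:
    """Hypothetical helper: greedy continuation of a prompt."""
    model, tokenizer = load_model_and_tokenizer()
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("The Pile is")` downloads the weights on first use, so it is best run on a machine with enough disk space and memory.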

Dataset used to train Defalt-404/GPT-6B_Tuned_small_pile