This repository is gated: it is publicly accessible, but you must accept the access conditions on Hugging Face before you can download its files and content.

Built with Axolotl

See axolotl config

axolotl version: 0.3.0

base_model: justinj92/phi2-platypus
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
is_llama_derived_model: false
trust_remote_code: true

load_in_8bit: false
load_in_4bit: false
strict: false

rl: true
datasets:
  - path: Intel/orca_dpo_pairs
    split: train
    type: intel_apply_chatml
  - path: argilla/ultrafeedback-binarized-preferences
    split: train
    type: argilla_apply_chatml
dataset_prepared_path: ./dpoplatypus-phi2/last_run_prepared
val_set_size: 0.0
output_dir: ./dpoplatypus-phi2/
#'Wqkv', 'out_proj', 'fc2', 'linear', 'fc1'
adapter:
sequence_len: 2048
sample_packing: false
pad_to_sequence_len:

lora_r: 64
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:
lora_modules_to_save:
  - embd
  - lm_head
hub_model_id: justinj92/phi2-platypus-dpo


wandb_project: phi2-platypus-dpo
wandb_entity: justinjoy-5
wandb_watch:
wandb_run_id:
wandb_log_model:

gradient_accumulation_steps: 4
micro_batch_size: 1
num_epochs: 1
optimizer: adamw_torch
adam_beta2: 0.95
adam_epsilon: 0.00001
lr_scheduler: cosine
max_grad_norm: 1.0
learning_rate: 0.00002

train_on_inputs: false
group_by_length: false
bf16: true
fp16: false
tf32: true

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 100
eval_steps:
evals_per_epoch: 4
saves_per_epoch: 2
eval_table_size:
debug:
deepspeed:
weight_decay: 0.1
fsdp:
fsdp_config:
resize_token_embeddings_to_32x: true
special_tokens:
  pad_token: "<|endoftext|>"
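
The intel_apply_chatml and argilla_apply_chatml dataset types above indicate that each preference pair (prompt, chosen response, rejected response) is rendered as ChatML text before DPO training. The following sketch only illustrates that formatting; the helper and the record fields (question/chosen/rejected, as in Intel/orca_dpo_pairs) are assumptions for illustration, not Axolotl's internal prompt strategy.

```python
# Illustrative ChatML rendering of a DPO preference pair.
# Field names mirror Intel/orca_dpo_pairs; the helper itself is an
# assumption for illustration, not Axolotl's implementation.
def to_chatml_pair(record: dict) -> dict:
    prompt = (
        "<|im_start|>user\n"
        f"{record['question']}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
    return {
        "prompt": prompt,
        "chosen": f"{record['chosen']}<|im_end|>",
        "rejected": f"{record['rejected']}<|im_end|>",
    }

example = {
    "question": "What is the capital of France?",
    "chosen": "The capital of France is Paris.",
    "rejected": "France does not have a capital city.",
}
print(to_chatml_pair(example)["prompt"])
```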

dpoplatypus-phi2

This model is a DPO fine-tuned version of justinj92/phi2-platypus, trained on the Intel/orca_dpo_pairs and argilla/ultrafeedback-binarized-preferences preference datasets.

Model description

dpoplatypus-phi2 is a ~2.78B-parameter, Phi-2-derived causal language model obtained by applying Direct Preference Optimization (DPO) to justinj92/phi2-platypus. Weights are stored in safetensors format with F32 and BF16 tensors.

Intended uses & limitations

More information needed
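
Pending a fuller description, the model can be loaded like any other Phi-2 derivative with transformers. The snippet below is a minimal, untested sketch: the repository id justinj92/dpoplatypus-phi2 and the ChatML prompt format are assumptions based on this card and the training config above; adjust dtype and device placement to your hardware.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "justinj92/dpoplatypus-phi2"  # assumed repository id for this card

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # training ran in bf16
    trust_remote_code=True,      # Phi-2 code path, as in the config
    device_map="auto",
)

# ChatML-style prompt, mirroring the *_apply_chatml formatting used for DPO.
prompt = (
    "<|im_start|>user\n"
    "Summarize what Direct Preference Optimization does in one sentence.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```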

Training and evaluation data

Training used DPO on two public preference datasets, Intel/orca_dpo_pairs and argilla/ultrafeedback-binarized-preferences, both rendered as ChatML. No validation split was held out (val_set_size: 0.0), and no evaluation results are reported.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • training_steps: 19120
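
For reference, the reported total_train_batch_size follows directly from the settings above (per-device batch size × gradient accumulation steps, assuming a single device):

```python
micro_batch_size = 1              # train_batch_size per device
gradient_accumulation_steps = 4
num_devices = 1                   # assumed single GPU

total_train_batch_size = micro_batch_size * gradient_accumulation_steps * num_devices
assert total_train_batch_size == 4
```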

Training results

Framework versions

  • Transformers 4.37.0.dev0
  • Pytorch 2.0.1+cu117
  • Datasets 2.16.1
  • Tokenizers 0.15.0