Axolotl configuration:


base_model: cognitivecomputations/dolphin-2.9.4-llama3.1-8b
model_type: LlamaForCausalLM
tokenizer_type: AutoTokenizer
tokenizer:
  name_or_path: "https://huggingface.co/cognitivecomputations/dolphin-2.9.4-llama3.1-8b/resolve/main/tokenizer.json"


load_in_8bit: false
load_in_4bit: true
strict: false
save_safetensors: true
bnb_4bit_quant_type: "nf4"
bnb_4bit_compute_dtype: "bf16"
bnb_4bit_use_double_quant: true

rl: dpo
chat_template: chatml
datasets:
  - path: mlabonne/orpo-dpo-mix-40k-flat
    split: train
    type: chatml.intel

dataset_prepared_path: /workspace/axolotl/dataset-prepared
val_set_size: 0.0
output_dir: ./out

adapter: qlora
lora_model_dir:

sequence_len: 2048
sample_packing: false
pad_to_sequence_len: false

lora_r: 64
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:
lora_target_modules:

wandb_project: axolotl
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:


gradient_accumulation_steps: 4  # Reduced from 8 to 4 due to large VRAM
micro_batch_size: 2  # Increased micro-batch size to 2
num_epochs: 1
optimizer: paged_adamw_8bit
lr_scheduler: cosine
learning_rate: 5e-6
train_on_inputs: false
group_by_length: false

bf16: true  # Use bf16 as it is optimal for A40 GPUs
fp16: false
tf32: true  # TF32 is supported by A40 and improves performance

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true
warmup_steps: 100
evals_per_epoch: 0
eval_table_size:
eval_table_max_new_tokens: 128
saves_per_epoch: 1
debug:
deepspeed: deepspeed_configs/zero2.json  # Enable DeepSpeed with ZeRO Stage 2
weight_decay: 0.0
special_tokens:
  pad_token: <|end_of_text|>
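
Two quantities follow from the settings above but are not stated in the config: the effective batch size per optimizer step and the LoRA scaling factor. The sketch below derives both from the values in the config. The number of GPUs is not given in the source, so `num_gpus` is a hypothetical parameter. The `lora_alpha / lora_r` scaling assumes the standard LoRA formulation (no rank-stabilized scaling).

```python
# Values copied from the configuration above.
micro_batch_size = 2
gradient_accumulation_steps = 4
lora_r = 64
lora_alpha = 32


def effective_batch_size(num_gpus: int) -> int:
    """Sequences contributing to each optimizer step.

    num_gpus is an assumption: the config does not say how many
    devices were used for training.
    """
    return micro_batch_size * gradient_accumulation_steps * num_gpus


# Standard LoRA multiplies the adapter update by alpha / r.
lora_scale = lora_alpha / lora_r

if __name__ == "__main__":
    for n in (1, 2, 4):  # hypothetical GPU counts
        print(f"{n} GPU(s): effective batch size = {effective_batch_size(n)}")
    print(f"LoRA scaling factor = {lora_scale}")
```

With `lora_alpha` at half of `lora_r`, the adapter update is scaled by 0.5. Together with the low `learning_rate` of 5e-6, this makes for a conservative update size, which is a common choice for DPO runs.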
