Weyaxi/Neural-una-cybertron-7b


Neural-una-cybertron-7b

Neural-una-cybertron-7b is an fblgit/una-cybertron-7b-v2-bf16 model that has been further fine-tuned with Direct Preference Optimization (DPO) using the Intel/orca_dpo_pairs dataset.

This model was created after examining the procedure used for the mlabonne/NeuralHermes-2.5-Mistral-7B model. Special thanks to @mlabonne.

Additional Information

This model was fine-tuned on an NVIDIA A100-SXM4-40GB GPU.

The total training time was 1 hour and 10 minutes.

Prompt Template(s)

ChatML

<|im_start|>system
{system}<|im_end|>
<|im_start|>user
{user}<|im_end|>
<|im_start|>assistant
{assistant}<|im_end|>
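
The template above can be filled in with plain string formatting. The following is a minimal sketch, not the author's code: the helper name `build_chatml_prompt` is hypothetical, and it leaves the assistant turn open so the model can generate the reply.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Format one system turn and one user turn as ChatML.

    The assistant turn is opened but not closed, so the model's
    generation continues from there.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


prompt = build_chatml_prompt("You are a helpful assistant.", "What is DPO?")
print(prompt)
```

When generating, `<|im_end|>` would typically be treated as the stop sequence, assuming the tokenizer defines it as a special token.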

Training hyperparameters

LoRA:

r=16
lora_alpha=16
lora_dropout=0.05
bias="none"
task_type="CAUSAL_LM"
target_modules=['k_proj', 'gate_proj', 'v_proj', 'up_proj', 'q_proj', 'o_proj', 'down_proj']
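
To give a sense of what these LoRA settings imply, the sketch below counts the adapter parameters that r=16 adds across the seven target modules. The layer shapes are an assumption: they are the usual Mistral-7B dimensions (hidden size 4096, key/value projection size 1024, MLP size 14336, 32 layers), which this card does not state explicitly.

```python
# Assumed Mistral-7B-style dimensions (not stated in this model card).
HIDDEN = 4096
KV_DIM = 1024
INTERMEDIATE = 14336
NUM_LAYERS = 32

# Values taken from the LoRA settings above.
R = 16
LORA_ALPHA = 16

# (in_features, out_features) for each targeted linear layer.
SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features: int, out_features: int, r: int) -> int:
    """A LoRA adapter adds A (r x in) and B (out x r) next to the frozen weight."""
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o, R) for i, o in SHAPES.values())
total = per_layer * NUM_LAYERS
scaling = LORA_ALPHA / R  # factor applied to the low-rank update

print(f"adapter params per layer: {per_layer:,}")  # 1,310,720
print(f"adapter params in total:  {total:,}")      # 41,943,040
print(f"LoRA scaling factor:      {scaling}")      # 1.0
```

Under these assumed shapes, the trainable adapters come to roughly 0.6% of the 7.24B base parameters.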

Training arguments:

per_device_train_batch_size=4
gradient_accumulation_steps=4
gradient_checkpointing=True
learning_rate=5e-5
lr_scheduler_type="cosine"
max_steps=200
optim="paged_adamw_32bit"
warmup_steps=100
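
The training arguments above determine the effective batch size and the shape of the learning-rate curve. The sketch below works both out in plain Python. It assumes a single GPU and the common linear-warmup-then-cosine-decay form of the "cosine" scheduler; `lr_at` is a hypothetical helper, not a library function.

```python
import math

# Values taken from the training arguments above.
LEARNING_RATE = 5e-5
WARMUP_STEPS = 100
MAX_STEPS = 200
PER_DEVICE_BATCH = 4
GRAD_ACCUM = 4
NUM_GPUS = 1  # assumption: the card mentions one A100


def lr_at(step: int) -> float:
    """Linear warmup to the peak rate, then cosine decay to zero at MAX_STEPS."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (MAX_STEPS - WARMUP_STEPS)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_GPUS
pairs_seen = effective_batch * MAX_STEPS

print(f"effective batch size: {effective_batch}")   # 16
print(f"preference pairs seen: {pairs_seen}")       # 3200
for step in (0, 50, 100, 150, 200):
    print(f"step {step:3d}: lr = {lr_at(step):.2e}")
```

With warmup_steps=100 and max_steps=200, half of the run is spent warming up, and the peak rate of 5e-5 is reached only at step 100.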

DPOTrainer:

beta=0.1
max_prompt_length=1024
max_length=1536
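
For reference, here is how beta enters the objective, assuming the standard sigmoid form of the DPO loss (the card does not say which loss variant was used). The inputs are summed log-probabilities of the chosen and rejected responses under the policy being trained and under the frozen reference model; the function name `dpo_loss` is illustrative.

```python
import math

BETA = 0.1  # value from the DPOTrainer settings above


def dpo_loss(policy_chosen: float, policy_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = BETA) -> float:
    """Standard sigmoid DPO loss for a single preference pair.

    loss = -log(sigmoid(beta * margin)), where margin measures how much
    more the policy prefers the chosen response than the reference does.
    """
    margin = (policy_chosen - policy_rejected) - (ref_chosen - ref_rejected)
    logits = beta * margin
    # -log(sigmoid(x)) equals log(1 + exp(-x))
    return math.log1p(math.exp(-logits))


# Policy identical to the reference: the loss is log(2).
print(dpo_loss(-12.0, -12.0, -12.0, -12.0))

# Policy favors the chosen response more than the reference does: lower loss.
print(dpo_loss(-10.0, -14.0, -12.0, -12.0))
```

A smaller beta lets the policy drift further from the reference model before the loss saturates; 0.1 is a commonly used value.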


Model size: 7.24B params (Safetensors)

Tensor type: BF16

Dataset used to train Weyaxi/Neural-una-cybertron-7b: Intel/orca_dpo_pairs