Edit model card

Genesist-8B-EarlyPrototype-0.4 GGUF

This is an early prototype of the Genesist-8B model, fine-tuned from the Llama-3-8B-Instruct model using Supervised Fine-Tuning (SFT). It is designed to better understand and follow specific instructions in Indonesian.

Model Details

  • Base Model: Llama-3-8B-Instruct
  • Fine-tuning Method: Supervised Fine-Tuning (SFT)
  • Training Data: Approximately 45 million tokens of instruction data in Indonesian, specifically curated to improve the model's ability to follow instructions.
  • Languages: Indonesian (id), English (en)
  • License: Llama3

Training Hyperparameters

  • max_seq_length: 16385
  • per_device_train_batch_size: 2
  • gradient_accumulation_steps: 4
  • warmup_steps: 5
  • num_train_epochs: 1
  • learning_rate: 5e-5
  • logging_steps: 1
  • optim: "adamw_8bit"
  • weight_decay: 0.01
  • lr_scheduler_type: "linear"
  • seed: 3407
Downloads last month
6,148
GGUF
Model size
8.03B params
Architecture
llama