metadata
datasets:
  - genesist-logs
language:
  - id
  - en
license: llama3
tags:
  - text-generation
  - sft
  - llama
  - llama-3
  - unsloth

Genesist-8B-EarlyPrototype-0.4

This is an early prototype of the Genesist-8B model, fine-tuned from Llama-3-8B-Instruct with Supervised Fine-Tuning (SFT). It is intended to improve instruction following in Indonesian.

Model Details

  • Base Model: Llama-3-8B-Instruct
  • Fine-tuning Method: Supervised Fine-Tuning (SFT)
  • Training Data: Approximately 45 million tokens of instruction data in Indonesian, specifically curated to improve the model's ability to follow instructions.
  • Languages: Indonesian (id), English (en)
  • License: Llama3

Training Hyperparameters

  • max_seq_length: 16385
  • per_device_train_batch_size: 2
  • gradient_accumulation_steps: 4
  • warmup_steps: 5
  • num_train_epochs: 1
  • learning_rate: 5e-5
  • logging_steps: 1
  • optim: "adamw_8bit"
  • weight_decay: 0.01
  • lr_scheduler_type: "linear"
  • seed: 3407
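
To make the interaction between these settings concrete, here is a minimal sketch in plain Python that records the values listed above and derives two quantities from them: the effective batch size per optimizer step, and the learning rate under a linear schedule with warmup. The names `SFTHyperparameters`, `effective_batch_size`, and `linear_lr` are hypothetical helpers written for illustration, not part of any training library. The `num_devices` and `total_steps` arguments are assumptions, since this card states neither the number of GPUs nor the total number of optimizer steps. The schedule formula follows the usual linear warmup and linear decay shape and may differ in detail from what the actual training run used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SFTHyperparameters:
    """Values copied from the hyperparameter list in this card."""
    max_seq_length: int = 16385
    per_device_train_batch_size: int = 2
    gradient_accumulation_steps: int = 4
    warmup_steps: int = 5
    num_train_epochs: int = 1
    learning_rate: float = 5e-5
    logging_steps: int = 1
    optim: str = "adamw_8bit"
    weight_decay: float = 0.01
    lr_scheduler_type: str = "linear"
    seed: int = 3407


def effective_batch_size(hp: SFTHyperparameters, num_devices: int = 1) -> int:
    """Sequences consumed per optimizer step.

    num_devices is an assumption; the card does not state the GPU count.
    """
    return (
        hp.per_device_train_batch_size
        * hp.gradient_accumulation_steps
        * num_devices
    )


def linear_lr(step: int, hp: SFTHyperparameters, total_steps: int) -> float:
    """Learning rate at a given optimizer step for linear warmup then linear decay.

    total_steps is a hypothetical value; the card does not report it.
    """
    if step < hp.warmup_steps:
        # Warmup: ramp from 0 up to the peak learning rate.
        return hp.learning_rate * step / max(1, hp.warmup_steps)
    # Decay: ramp from the peak learning rate down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return hp.learning_rate * remaining / max(1, total_steps - hp.warmup_steps)


if __name__ == "__main__":
    hp = SFTHyperparameters()
    print("effective batch size:", effective_batch_size(hp))
    for step in (0, 2, 5, 500, 1000):
        # total_steps=1000 is an arbitrary illustrative choice.
        print(step, linear_lr(step, hp, total_steps=1000))
```

With a per-device batch size of 2 and 4 gradient accumulation steps, each optimizer step on a single device covers 8 sequences. With only 5 warmup steps, the learning rate reaches its peak of 5e-5 almost immediately and then decays for the rest of the single epoch.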