LargeWorldModel 7B with 1,000,000-token context, fine-tuned on the AEZAKMI v3.1 dataset for epochs at a max_seq_len of 4000 using QLoRA with lora_r 32 and a cosine learning-rate schedule decaying from 0.00015. I will upload exl2 quants and the base model in safetensors format soon.
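To make the cosine schedule concrete, here is a minimal sketch of how the learning rate would fall from the 0.00015 peak. Only the peak value comes from this card; the decay floor of 0, the absence of warmup, the total step count, and the function name `cosine_lr` are all assumptions for illustration, not the actual training configuration.

```python
import math

PEAK_LR = 0.00015  # peak learning rate stated in the model card


def cosine_lr(step: int, total_steps: int,
              peak_lr: float = PEAK_LR, min_lr: float = 0.0) -> float:
    """Cosine decay from peak_lr to min_lr over total_steps.

    Assumptions (not from the card): no warmup, decay floor min_lr = 0.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so out-of-range steps stay well defined.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 1000  # hypothetical step count, chosen only for the demo
    for s in (0, 250, 500, 750, 1000):
        print(s, f"{cosine_lr(s, total):.2e}")
```

Under these assumptions the rate starts at the peak, reaches half the peak at the midpoint of training, and ends at the floor.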

Fine-tuned with unsloth and FlashAttention-2 (FA2) on a local RTX 3090 Ti. Training took around 6 hours. I think most of the long-context capabilities remain.

Safetensors · Model size: 6.74B params · Tensor type: FP16

Collection including adamo1139/LWM-7B-1M-1000000ctx-AEZAKMI-3_1-1702