
LargeWorldModel 7B with a 1,000,000-token context window, fine-tuned on the AEZAKMI v3.1 dataset for epochs at a max_seq_len of 4000, using QLoRA with lora_r 32 and a cosine learning-rate schedule decaying from 0.00015. I will upload exl2 quants and the base model in safetensors format soon.
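To make the learning-rate schedule concrete, here is a minimal sketch of a cosine decay starting at the peak value of 0.00015 mentioned above. It is plain Python and not the actual training script: the total step count, the absence of warmup, and the decay floor of zero are all assumptions made for illustration, and `cosine_lr` is a hypothetical helper name.

```python
import math

PEAK_LR = 0.00015  # peak learning rate stated in the model card


def cosine_lr(step: int, total_steps: int,
              peak_lr: float = PEAK_LR, min_lr: float = 0.0) -> float:
    """Learning rate at `step` under a plain cosine decay.

    Assumptions (not from the model card): no warmup phase, and the
    schedule decays to `min_lr` (zero by default) at `total_steps`.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp the step so values outside the run stay at the endpoints.
    progress = min(max(step, 0), total_steps) / total_steps
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 1000  # hypothetical number of optimizer steps
    for step in (0, 250, 500, 750, 1000):
        print(f"step {step:4d}: lr = {cosine_lr(step, total):.3e}")
```

The rate starts at the peak, passes through half the peak at the midpoint of training, and reaches the floor on the final step.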

Fine-tuned with unsloth and FlashAttention-2 (FA2) on a local RTX 3090 Ti. Training took around 6 hours. I think most of the long-context capabilities remain.
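As a rough illustration of why QLoRA suits a single consumer GPU, the sketch below estimates the memory taken by the base weights alone for the 6.74B-parameter model listed on this page. It assumes the QLoRA base weights are stored in 4 bits, and it ignores quantization overhead, LoRA adapter weights, optimizer state, and activations, so the real footprint during training is higher. `weight_bytes` is a hypothetical helper, not part of any library.

```python
PARAMS = 6.74e9  # parameter count listed on the model page


def weight_bytes(num_params: float, bits_per_param: int) -> float:
    """Bytes needed to store the weights only, at the given precision."""
    return num_params * bits_per_param / 8


def to_gib(num_bytes: float) -> float:
    """Convert bytes to gibibytes (2**30 bytes)."""
    return num_bytes / 2**30


if __name__ == "__main__":
    # 16 bits matches the FP16 checkpoint; 4 bits is the assumed QLoRA base precision.
    for label, bits in (("FP16", 16), ("4-bit", 4)):
        print(f"{label:>5}: {to_gib(weight_bytes(PARAMS, bits)):.2f} GiB")
```

Under these assumptions the 4-bit weights take a quarter of the FP16 size, which leaves most of the RTX 3090 Ti's 24 GB for adapters, optimizer state, and activations.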

Model size: 6.74B params (Safetensors). Tensor type: FP16.

Collection including adamo1139/LWM-7B-1M-1000000ctx-AEZAKMI-3_1-1702