This is a merge of LongAlpaca-70B-lora into lizpreciatior's lzlv_70b_fp16_hf, with the extra embedding row and pad token removed so that the vocabularies match.
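The vocabulary fix amounts to trimming the trailing embedding row the LoRA training added. A minimal sketch, assuming the standard Llama-2 vocab size of 32000 (the extra pad token makes it 32001); a toy nested list stands in for the real torch tensor:

```python
# Sketch only: trim trailing rows (e.g. an added pad token) so the
# merged embedding matrix matches the base model's vocabulary size.
def trim_embedding(embedding, target_vocab=32000):
    """Drop rows beyond target_vocab; assumes Llama-2's 32000-token vocab."""
    return embedding[:target_vocab]

toy_embedding = [[0.0] * 8 for _ in range(32001)]  # 32001 rows: one extra pad row
trimmed = trim_embedding(toy_embedding)
print(len(trimmed))  # 32000
```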

There is no additional fine-tuning. The resulting model does not appear to be broken; you can test whether it truly behaves like the original model with added 32K context capability (use linear rope scaling with a factor of 8).
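For reference, linear RoPE scaling simply divides position indices by the scale factor, so a 32K-token sequence occupies the same rotary-embedding range the base model saw at 4K. A sketch of the arithmetic (not this repo's code; the 4096 base context is the standard Llama-2 value):

```python
# Linear RoPE scaling, sketched: effective position = position / scale.
BASE_CONTEXT = 4096  # Llama-2 native context window (assumed)
SCALE = 8            # linear rope scaling factor used with this merge

def scaled_position(pos, scale=SCALE):
    """Effective position index fed to the rotary embedding."""
    return pos / scale

extended_context = BASE_CONTEXT * SCALE
print(extended_context)  # 32768
print(scaled_position(extended_context - 1) < BASE_CONTEXT)  # True
```

In libraries that support it (e.g. recent transformers), this corresponds to a config along the lines of `rope_scaling={"type": "linear", "factor": 8.0}`.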

ChuckMcSneed ran a benchmark here, indicating roughly 30% degradation at 8x the original context length.

You could also try merging this with other longLORA-derived models (such as Aurelian).

A 6-bit EXL2 quantization is available here, and a 4-bit EXL2 quantization here.

See this discussion for how to create merges like these.
