Description

ExLlamaV2 quant of NeverSleep/Nethena-20B

3 bits per weight (bpw), head bits set to 8

Prompt template: Alpaca

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
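
The template above can be filled in programmatically. This is a minimal sketch, not part of the original card: the helper name `build_alpaca_prompt` is hypothetical, and the trailing newline after `### Response:` is an assumption about how the template ends.

```python
# Alpaca template as shown above; {prompt} is replaced by the instruction.
# Assumption: a single newline follows "### Response:".
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n"
    "{prompt}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(prompt: str) -> str:
    """Insert the instruction into the Alpaca template (hypothetical helper)."""
    return ALPACA_TEMPLATE.format(prompt=prompt.strip())


if __name__ == "__main__":
    print(build_alpaca_prompt("Describe a quiet harbor at dawn."))
```

The returned string is what you would pass to your inference frontend as the full prompt; the model's reply is expected to follow the `### Response:` line.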

VRAM

My VRAM usage with 20B models is:

Bits per weight   Context   VRAM
6bpw              4k        24 GB
4bpw              4k        18 GB
4bpw              8k        24 GB
3bpw              4k        16 GB
3bpw              8k        21 GB

These figures are rounded up rather than exact, and they were measured on a Windows machine.
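
To pick a quant for a given card, the VRAM figures above can be turned into a small lookup. This is only a sketch: the function name `best_bpw` is hypothetical, and reading "4k" and "8k" as 4096 and 8192 tokens is an assumption. The numbers are the rounded-up measurements from this card, so treat the result as a rough guide.

```python
from typing import Optional

# (bits per weight, context in tokens) -> approximate VRAM in GB.
# Values are the rounded-up figures reported in this card.
# Assumption: "4k" means 4096 tokens and "8k" means 8192 tokens.
VRAM_GB = {
    (6, 4096): 24,
    (4, 4096): 18,
    (4, 8192): 24,
    (3, 4096): 16,
    (3, 8192): 21,
}


def best_bpw(vram_gb: float, context: int) -> Optional[int]:
    """Return the highest listed bpw whose reported usage fits, else None."""
    fitting = [
        bpw
        for (bpw, ctx), used in VRAM_GB.items()
        if ctx == context and used <= vram_gb
    ]
    return max(fitting) if fitting else None


if __name__ == "__main__":
    print(best_bpw(24, 8192))
```

A result that exactly matches your card's capacity leaves no headroom for the desktop or other applications, so stepping down one quant level may be the safer choice.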