Text Generation
Transformers
English
llama
Inference Endpoints
text-generation-inference

Much love for the train.

#1
by deleted - opened
deleted

Thanks for the work. I'm stress testing now, posting recommendations over on the dataset thread.

deleted

Hey, quick follow-up: if you get a chance to do a Windows-compatible GPTQ quant for this, that'd really help me with prompt testing. I think it's probably good enough to bother with.

Just leave out --act-order.

CUDA_VISIBLE_DEVICES=0 python llama.py ../../models/PathToVicuna --true-sequential --wbits 4 --groupsize 128 --save_safetensors vicuna-13b-free-4bit-128g.safetensors

I think that's the command (someone feel free to correct me, I don't feel like checking). Only takes like an hour.
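For anyone wondering what `--wbits 4 --groupsize 128` actually buys you: each group of 128 weights shares one scale/offset pair, and every weight is stored as a 4-bit code. This is just a toy round-to-nearest sketch of that storage format (GPTQ itself does a smarter, error-compensating quantization), with made-up data:

```python
import numpy as np

def quantize_groupwise(w, wbits=4, groupsize=128):
    """Toy group quantization: per-group min/max scale, 4-bit codes."""
    qmax = 2 ** wbits - 1                         # 15 levels for 4-bit
    w = w.reshape(-1, groupsize)                  # one row per group
    lo = w.min(axis=1, keepdims=True)
    scale = (w.max(axis=1, keepdims=True) - lo) / qmax
    q = np.round((w - lo) / scale).astype(np.uint8)  # 4-bit codes (0..15)
    return q, scale, lo

def dequantize(q, scale, lo):
    return q * scale + lo

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)  # pretend weight tensor
q, scale, lo = quantize_groupwise(w)
w_hat = dequantize(q, scale, lo).reshape(-1)
# reconstruction error is bounded by half a quantization step per group
print(float(np.abs(w - w_hat).max()))
```

Larger group sizes pack tighter but share one scale across more weights, which is why 128 is the usual sweet spot for 13B-class llama models.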

deleted

Oh, and not to spam you, but if you're planning to train again as soon as we get a V4 for the dataset, feel free to ignore this request.

deleted

Grabbing the safetensors now. Appreciate it.

deleted changed discussion status to closed
