Please, a 13B version

#2
by IkariDev - opened

Please make a 13B version.

If I figure out how to get some A100 GPUs :) I tried training the 13B version on an A10G but ran out of GPU memory. I might look around on vast.ai, runpod.io, or similar for those A100s...
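(For rough intuition on why an A10G's 24 GB isn't enough: the weights alone for a 13B model in fp16/bf16 already fill the card, before gradients, optimizer state, or activations. A back-of-the-envelope sketch:)

# Back-of-the-envelope weight memory for a 13B-parameter model.
# Weights only; training additionally needs gradients, optimizer state,
# and activations, so real usage is higher.
params = 13e9
for name, bytes_per_param in [("fp16/bf16", 2.0), ("int8", 1.0), ("4-bit", 0.5)]:
    print(f"{name}: ~{params * bytes_per_param / 1024**3:.1f} GB")
# fp16/bf16: ~24.2 GB  (already at the A10G's 24 GB)
# int8:      ~12.1 GB
# 4-bit:     ~6.1 GB   (why 4-bit QLoRA can fit on a single consumer GPU)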

I was able to train the 13B model on two RTX 3090s:

{'train_runtime': 95229.7197, 'train_samples_per_second': 0.363, 'train_steps_per_second': 0.091, 'train_loss': 0.5828390517308127, 'epoch': 1.0}
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 8649/8649 [26:27:09<00:00, 11.01s/it]
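For anyone trying to reproduce the two-GPU run: a minimal sketch of loading the 13B base in 4-bit and letting transformers shard it across both cards. This assumes transformers + bitsandbytes and the standard Llama-2-13b base checkpoint; it is one plausible setup, not necessarily the exact script behind the numbers above.

# Minimal sketch: load Llama-2-13b in 4-bit, sharded across two GPUs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-13b-hf",       # assumed base checkpoint
    quantization_config=bnb_config,
    device_map="auto",                 # shards layers across all visible GPUs
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-13b-hf")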

Not to be greedy, but an uncensored version fine-tuned on LLongMA-2-13b would be even better. ;)
https://huggingface.co/conceptofmind/LLongMA-2-13b

It should be mentioned that these uncensored models still retain some safety bias from Llama 2. When used for storytelling in a dark fantasy setting, for example, they will still resist the well-established social mores of the setting, generating replies that break from the prompt's context. As much as I'd love a 13B llama2 chat uncensored, it might take some time to work out the ideal datasets.

@jriskin what's the difference between LLongMA-2-13b and Llama-2-13b?

@georgesung I can't imagine how you managed to fine-tune the 7B model on a single 24 GB GPU...
I tried, but got OOM, so I had to run on two GPUs, and after one hour it consumed more than 27 GB of VRAM:
[screenshot: GPU memory usage]

@arogov Did you use QLoRA for fine-tuning? You can reproduce my model like this:

git clone https://github.com/georgesung/llm_qlora
cd llm_qlora
pip install -r requirements.txt
python train.py configs/llama2_7b_chat_uncensored.yaml
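Under the hood, QLoRA freezes a 4-bit-quantized base model and trains only small LoRA adapters, which is what lets a 7B fine-tune fit in 24 GB. A minimal sketch of that pattern with peft; the concrete hyperparameters here (rank, target modules) are illustrative assumptions, the real ones live in the repo's YAML config.

# Sketch of the QLoRA pattern: 4-bit frozen base + trainable LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",             # assumed base checkpoint
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora = LoraConfig(
    r=16,                                   # adapter rank (assumed value)
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],    # assumed target modules
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()          # only the adapters train (<1% of params)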

Yes, but with a few code improvements.
