fp16 model?

#2
by nonetrix - opened

GGUF is nice, but I would like an fp16 model for merging.

fp16, or other quants like Q8 or Q6, please?

Katy's Historical Models org

I unfortunately don't have the compute to make this happen. I'll upload the YAML file of the merge if anyone wants to recreate it.

It's just a merge? I have 64GB of RAM to spare and an AMD GPU with an extra 16GB (AMD, so unstable as shit). I can do that easily.

I would quant an AWQ of this if we could see the model.
I thought that you made a GGUF from the model, but reading this suggests the GGUF is all you have for it. I didn't know you could merge models without fp16.

Katy's Historical Models org

I did it on the Kobold merge box a while back; that's why I didn't have access to the FP16 files to upload. It is just a merge, sorry if I misled you into thinking it was a finetune or something.

I'm reading this article about merging: https://huggingface.co/blog/mlabonne/merge-models
Do you remember which method you used?
I've never merged models before, but I have compute available to me.

edit0: nvm, I didn't read the config; it uses merge_method: task_arithmetic.
No clue how to do that one, but I'll research it.
edit1: this should do it: https://github.com/arcee-ai/mergekit/blob/main/notebook.ipynb
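edit2: for anyone following along, here's a minimal sketch of what a task_arithmetic mergekit config can look like. The model names, weights, and base model below are placeholders, not the actual recipe:

```yaml
# sketch of a task_arithmetic merge config; model names, weights, and base
# model are placeholders, not the actual KatyTestHistorical recipe
models:
  - model: example-org/model-a-7b      # hypothetical fine-tune
    parameters:
      weight: 1.0
  - model: example-org/model-b-7b      # hypothetical fine-tune
    parameters:
      weight: 0.5
merge_method: task_arithmetic
base_model: mistralai/Mistral-7B-v0.1  # assumed shared base for the task vectors
dtype: float16
```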

Merging locally would be much faster; free Colab gives you two meh cores and little RAM, though you can offload to VRAM with --read-to-gpu.
You also need storage for all the fp16 files, roughly 15GB per 7B model plus the resulting model, which would be hard to fit in the default 70GB.
Paid tiers could be better, though.

@saishf you are 100% correct. I am using a Jupyter notebook locally with some NVIDIA GPUs. Same thing as Colab, it just runs locally.

Actually, just installing mergekit and not using a Jupyter notebook (Colab) at all is optimal.

Not sure if this is the correct way, but it's running now: mergekit-yaml /opt/solidrust/merges/KatyTestHistorical-SultrySilicon-7B-V2.yaml . --cuda
Some of the models are gated, so you need to go to their pages and acknowledge the terms first.
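For anyone reproducing this, roughly the flow (the install method, paths, and output directory here are placeholders, just one way to do it):

```sh
# sketch of the flow; paths and output directory are placeholders
git clone https://github.com/arcee-ai/mergekit.git
cd mergekit
pip install -e .

# log in once so the gated base models can be downloaded
huggingface-cli login

# run the merge on GPU, writing the merged model to ./output
mergekit-yaml /opt/solidrust/merges/KatyTestHistorical-SultrySilicon-7B-V2.yaml ./output --cuda
```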

Jupyter is cool; I don't have much use for it though, as I only have a single GPU.
I first learnt about mergekit here: https://huggingface.co/blog/mlabonne/merge-models
It's nice and easy for a start; it doesn't go into detail on the latest methods, but it works. Plus --help will detail everything for you!
--low-cpu-memory is useful too, if you have more VRAM than RAM.
One I use every time is --out-shard-size "2B", which makes each shard around 4GB.

Thank you!
I used mergekit-yaml /opt/openbet/inference/KatyTestHistorical-SultrySilicon-7B-V2.yaml . --cuda --low-cpu-memory --out-shard-size "2B"

and created: https://huggingface.co/solidrust/KatyTestHistorical-SultrySilicon-7B-V2
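If anyone just wants the fp16 files locally to quant from (AWQ, GGUF, etc.), something like this should work, assuming a recent huggingface_hub CLI:

```sh
# pull the fp16 shards locally for further quantizing
pip install -U huggingface_hub
huggingface-cli download solidrust/KatyTestHistorical-SultrySilicon-7B-V2 \
  --local-dir ./KatyTestHistorical-SultrySilicon-7B-V2
```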


Hopefully we will see more quants now that the fp16 files are available! Most models get GGUFs within a day thanks to all the dedicated people on HF, and even EXL2 and AWQ for some.
