Possible Issues - Testing

#19
by SolidSnacke - opened

Maybe add here a check for the presence of a ready-made gguf-f16 file too?
(line 75) def download_model_repo()

Because if you delete the downloaded model files (folder) and the cache, and the creation of the imatrix.dat file is interrupted for some reason while the gguf-f16 file already exists, the script will start downloading the model files again. There is no check for the presence of an already converted file.

Something else.
Immediately after creating imatrix.dat, when executing this code, for some reason it gave an error that this file already exists.

(line 143) shutil.move(os.path.join(gguf_dir, "imatrix.dat"), gguf_dir)

But if you run the script again, there is no error and quantization starts. Perhaps this was an isolated incident, but I'm writing it down just in case.
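The error described above is consistent with how shutil.move behaves: moving imatrix.dat into a directory that already contains a file with that name (or into the directory it already sits in) raises an "already exists" error. A minimal sketch of a rerun-tolerant guard, assuming a helper named move_imatrix (the name and structure are illustrative, not the script's actual code):

```python
import os
import shutil

def move_imatrix(src_path: str, gguf_dir: str) -> str:
    """Move imatrix.dat into gguf_dir, tolerating reruns.

    If the file already sits in gguf_dir (e.g. after a previous,
    partially completed run), skip the move instead of letting
    shutil.move raise because the destination already exists.
    """
    dest_path = os.path.join(gguf_dir, os.path.basename(src_path))
    if os.path.abspath(src_path) == os.path.abspath(dest_path):
        return dest_path  # already in place, nothing to do
    if os.path.exists(dest_path):
        os.remove(dest_path)  # stale copy from an interrupted run
    shutil.move(src_path, dest_path)
    return dest_path
```

Passing an explicit destination file path (rather than a directory) also makes the intent unambiguous on repeated runs.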

Maybe add here a check for the presence of a ready-made gguf-f16 file too?

Oh. I thought this was already handled. Um, I'll need to test again later.

Immediately after creating imatrix.dat, when executing this code, for some reason it gave an error that this file already exists.

Was it the llama-3 script?

I'm assuming it was the regular script; that is handled in the llama-3 script, but I might have forgotten about the regular script. Oops. Seems like it, though they should be pretty much the same anyway.

@SolidSnacke Give it a try now, please. Both issues should be handled.

FantasiaFoundry changed discussion title from Addition to Possible Issues - Testing

I'll try to check.

@Lewdiculous I was a little surprised that you found my post on Reddit.

I looked at the code. This function does not check for the existence of a ready-made GGUF-F16 file: (line 70) def download_model_repo()
I mean, it only checks for the presence of the folder with the downloaded model.
If, for example, the user answered 'yes' here: (line 83, line 92) delete_model_dir = input("Remove HF model folder after converting original model to GGUF? (yes/no) (default: no): ").strip().lower()
Then the folder with the model will be deleted after the GGUF-F16 file is created.
And if he also deletes the .cache folder, then when he runs the script again (for example, to make another quantized model), the script will start downloading the model repository again, ignoring the presence of a ready-made GGUF-F16 file.
I suggest adding a check for the presence of at least the GGUF-F16 folder before downloading the repository.
That is, after the line - (line 72) models_dir = os.path.join(base_dir, "models")
add the lines - (line 104) gguf_dir = os.path.join(base_dir, "models", f"{model_name}-GGUF")
gguf_model_path = os.path.join(gguf_dir, f"{model_name}-F16.gguf")
And add a check: if os.path.exists(gguf_model_path)
If this file already exists, you can skip straight to the next function.
Maybe you implemented the check differently, but to be honest, I didn't see this problem addressed anywhere. I mean, I looked at the commits.
I only saw the changes around imatrix.dat, but I haven't checked that part yet, so I can't say anything.
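The suggestion above can be sketched as follows. The path construction mirrors the lines quoted from the script; the helper names (gguf_f16_path, need_download) are illustrative stand-ins, not the script's actual functions:

```python
import os

def gguf_f16_path(base_dir: str, model_name: str) -> str:
    """Path where the script writes the converted F16 GGUF,
    following the directory layout quoted in the discussion."""
    gguf_dir = os.path.join(base_dir, "models", f"{model_name}-GGUF")
    return os.path.join(gguf_dir, f"{model_name}-F16.gguf")

def need_download(base_dir: str, model_name: str) -> bool:
    """Return False when a converted F16 GGUF already exists, so the
    HF repository download can be skipped even if the original model
    folder and the .cache folder were deleted."""
    return not os.path.exists(gguf_f16_path(base_dir, model_name))
```

With a guard like this placed before download_model_repo's download step, an interrupted imatrix run would resume from the existing GGUF-F16 instead of re-downloading the repository.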

I guess I was testing something else then. Yeah... I'll try again in my next idle time.

Is it possible to make a pull request? I would just like to show my version of gguf-imat.py.

Ah, yeah, of course, make a PR; it's always welcome. This should be simple enough not to have issues.

Okay, I just don't want to accidentally upload garbage here.
I'll try to do it now.

It's fine, don't worry. This was mostly for personal use, and I eventually shared it since I saw people wanting to make their own quants following TheBloke's absence. It's still dirty and unpolished by nature; it just has to make the process less annoying. That's how it came to be.

Perhaps I'm doing something wrong, but when I try to submit corrections I get an error.
remote: Password authentication in git is no longer supported. You must use a user access token or an SSH key instead. See https://huggingface.co/blog/password-git-deprecation
fatal: Authentication failed for 'https://huggingface.co/FantasiaFoundry/GGUF-Quantization-Script/'

This is simple enough, so just use the web interface to submit your changes. It's fine. You can do both scripts, since it's the same thing.

Okay, I'll try

No rush, will check in the afternoon or evening when I sit down. But if you validated your tests it should be good, thanks for bringing it up.

I was able to do it anyway. Not a very familiar system compared to GitHub. Although yes, it's probably much safer, maybe... I don't know. (I was involved in the SillyTavern interface translation, which is why I'm used to GitHub's system.)

No rush, will check in the afternoon or evening when I sit down. But if you validated your tests it should be good, thanks for bringing it up.

At one point I wanted to figure out how to create these models myself, but I couldn't manage it. I'm lucky that you posted this script. I'm very grateful to you for it and wanted to do at least something in return.

Thanks mate.

Closed by PR#24.

FantasiaFoundry changed discussion status to closed
