Failure to load using DeepSpeed

#27
by Hazarth - opened

I am unable to get this model running with the latest oobabooga and DeepSpeed. Without DeepSpeed it loads correctly, but with DeepSpeed enabled I get an error telling me that the .safetensors file is missing its metadata.

I'm running the latest oobabooga with the latest transformers package on Ubuntu.
The specific error happens in the transformers package, in /lib/python3.9/site-packages/transformers/modeling_utils.py on line 432.

This is the relevant code bit:

if checkpoint_file.endswith(".safetensors") and is_safetensors_available():
    # Check format of the archive
    with safe_open(checkpoint_file, framework="pt") as f:
        metadata = f.metadata()
    if metadata.get("format") not in ["pt", "tf", "flax"]:
        raise OSError(
            f"The safetensors archive passed at {checkpoint_file} does not contain the valid metadata. Make sure "
            "you save your model with the `save_pretrained` method."
        )
    elif metadata["format"] != "pt":
        raise NotImplementedError(
            f"Conversion from a {metadata['format']} safetensors archive to PyTorch is not implemented yet."
        )
    return safe_load_file(checkpoint_file)

I tried hacking the format in and managed to get past this point, but then hit a different error, which I think might be due to the rest of the metadata missing anyway.

Apparently safetensors files can carry a metadata header, and when loading this way with DeepSpeed the only supported format value is PyTorch's "pt".
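
For anyone who wants to try the same workaround, here's a minimal sketch of how that metadata could be written back using the safetensors library. The filenames are placeholders, and this is just my assumption of how it would be done, not something the model author has endorsed:

from safetensors import safe_open
from safetensors.torch import load_file, save_file

src = "model.safetensors"        # placeholder: the downloaded checkpoint
dst = "model-fixed.safetensors"  # write a copy rather than overwriting

# Inspect the existing header metadata (may be None or missing the
# "format" key, which is what trips up the check in modeling_utils.py).
with safe_open(src, framework="pt") as f:
    print(f.metadata())

# Re-save the same tensors with the "format" key transformers checks for.
tensors = load_file(src)
save_file(tensors, dst, metadata={"format": "pt"})

Note this only gets past the format check; it didn't resolve the later error for me, so keeping the original .pt around may still be the safer route.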

I'm not yet sure why the code takes this path only with DeepSpeed, since the model loads without issues otherwise. Perhaps oobabooga does some alternative conversion when DeepSpeed is disabled? Not sure.

However, it might be useful for some people, including me, if you also kept the original .pt model that was deleted in this commit: https://huggingface.co/anon8231489123/vicuna-13b-GPTQ-4bit-128g/commit/d904e5060f4b6273210f48e57601dc8933653162

Maybe place it into a separate directory, so the root folder can still serve as the default setup when using git clone?
