Is this still the recommended way of sharing custom models?

Opened by sam-mosaic

The docs say "This API is experimental and may have some slight breaking changes in the next releases." Is this still the right way to share a model that requires custom code?

If one of the Python files depends on a library that needs to be pip-installed... does the user just need to make sure to run that pip installation before calling model = AutoModelForSomething.from_pretrained("custom-model-that-has-pip-dependency")?

Yes, this is still the recommended way, and yes, any dependency should be installed before calling the auto model API (note that you will need to add trust_remote_code=True). Normally, the API scans the custom code for imports and raises a helpful error message if a dependency is missing.
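
For example, assuming the repo id from your question and a causal LM head (a minimal sketch, not a definitive recipe):

import transformers

# Any library the custom code imports must already be pip-installed in this
# environment; transformers scans the repo's Python files for imports and
# raises an error naming whatever is missing.
model = transformers.AutoModelForCausalLM.from_pretrained(
    "custom-model-that-has-pip-dependency",
    trust_remote_code=True,  # required to execute custom code from the Hub
)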

I tried to write a script based on https://huggingface.co/docs/transformers/custom_models which pulls a checkpoint from s3, imports the needed custom code, instantiates an HF wrapper model, then saves it so I can push it to the Hub. However, after saving it, it won't load from disk because of import errors. Are there any other code samples for generating custom models?

I'm afraid we can't really help without seeing any code. Do you have a reproducer?

Thanks for offering to look at the code; I didn't want to be annoying and was just hoping I'd missed some docs.

I have a directory like this:

.
├── configuration_my_model.py
├── flash_attention.py
├── flash_attn_triton.py
├── model-yamls
│   ├── checkpoint-and-config
│   │   ├── config.yaml
│   │   └── ep0-ba4800-rank0.pt
│   └── another-checkpoint-and-config
│       ├── config.yaml
│       └── ep0-ba24000-rank0.pt
├── modeling.py
└── save_pretrained.py

where save_pretrained.py loads the PyTorch checkpoint and YAML config, imports the modeling code from modeling.py (which in turn imports from the configuration and flash-attention files), and saves the result:

import glob
import os

import torch
from omegaconf import OmegaConf as om

import modeling


def model_name(yaml_name: str) -> str:
    # generate name from yaml_name
    return yaml_name


if __name__ == "__main__":
    model_yamls_dir = 'model-yamls'

    # use glob to find all the yaml files in the model-yamls directories
    yaml_paths = glob.glob(os.path.join(model_yamls_dir, '**/*.yaml'), recursive=True)

    for yaml_path in yaml_paths:
        cfg = om.load(yaml_path)
        cfg = cfg.parameters.model

        model_dir = os.path.dirname(yaml_path)
        pt_paths = glob.glob(os.path.join(model_dir, '*.pt'))
        # there should only be one pt file, so use the first one
        pt_path = pt_paths[0]

        checkpoint = torch.load(pt_path)
        cm = modeling.MyModel(cfg)
        cm.model.load_state_dict(checkpoint['state']['model'])
        model = cm.model

        # register the custom classes so that save_pretrained copies their
        # source files into the output directory and records them in the
        # config's auto_map; this is the step that keeps hitting import errors
        modeling.MyModelConfig.register_for_auto_class()
        modeling.MyModel.register_for_auto_class("AutoModelForCausalLM")

        save_path = model_name(yaml_name=yaml_path)

        model.save_pretrained(save_path)
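
My understanding from the docs is that, after registering the auto classes, save_pretrained should copy the custom code in next to the weights, so the output directory would look roughly like this (my expectation, not what actually happens; exact weight filenames depend on the transformers version):

save_path/
├── config.json               # with an auto_map entry pointing at the custom classes
├── configuration_my_model.py
├── modeling.py
└── pytorch_model.bin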

But instead, one of two things goes wrong. Either this gives

from .configuration_my_model import MyModelConfig
ImportError: attempted relative import with no known parent package

if I use the . prefix on the import, or, if I drop the ., the files don't get copied into the Hugging Face model directory, since without the . prefix transformers thinks the module I am importing is a pip package.
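
In other words, the top of modeling.py has to be one of these two variants (sketched here, with the observed behavior as comments):

# Variant 1: relative import. save_pretrained.py then dies with
# "ImportError: attempted relative import with no known parent package",
# because modeling.py is imported as a top-level module, not as part of a package:
#
#     from .configuration_my_model import MyModelConfig

# Variant 2: absolute import. The script runs, but transformers treats
# configuration_my_model as a pip package and doesn't copy
# configuration_my_model.py into the output directory:
from configuration_my_model import MyModelConfig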

So my question is: how does one get the imports right so that the custom files are copied into the created directory and the model can be loaded with .from_pretrained? Thank you
