How to convert a Megatron model to Hugging Face format?

#6 opened by cdj0311

Hi,
I want to convert a Megatron model (trained by myself with the bigcode-project/Megatron-LM repo) to Hugging Face format. Can you provide a script to convert it?

BigCode org

You can clone this repo: https://github.com/bigcode-project/transformers/ and use this code
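
For reference, here is a minimal sketch of what that conversion step can look like. It assumes the fork contains a Megatron-to-Transformers converter along the lines of upstream's convert_megatron_gpt2_checkpoint.py; the exact script name, location, and arguments in the bigcode-project fork may differ (it may ship its own variant for the GPTBigCode / multi-query-attention architecture), so check its source tree first.

git clone https://github.com/bigcode-project/transformers.git
cd transformers
pip install -e .
# CKPT is a placeholder for a single-partition Megatron checkpoint file
# (e.g. .../mp_rank_00/model_optim_rng.pt); see the script's --help for
# the exact input format it expects.
python src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py CKPT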

Thanks,
but this code only supports converting models with 1-way tensor/pipeline parallelism. How do I convert a model when tensor/pipeline parallelism > 1?

BigCode org

You need to merge the partitions with Megatron-LM before the conversion, with something like this:

python Megatron-LM/tools/checkpoint_util.py \
        --model-type GPT  \
        --load-dir CKPT_DIR \
        --save-dir OUTPUT_PATH \
        --target-tensor-parallel-size 1 \
        --target-pipeline-parallel-size 1 
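
Once the merge has produced a single-partition checkpoint in OUTPUT_PATH, that checkpoint can be passed to the conversion code from the bigcode-project/transformers fork mentioned above. A rough sketch, assuming the standard Megatron checkpoint layout and an upstream-style converter script (iter_NNNNNNN and the script path are placeholders; the fork may provide its own converter for the GPTBigCode architecture):

# Step 2 (sketch): convert the merged TP=1/PP=1 checkpoint to Hugging Face format.
# iter_NNNNNNN stands for the iteration directory Megatron wrote under OUTPUT_PATH;
# check the fork for the actual conversion script to use for this architecture.
python transformers/src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py \
        OUTPUT_PATH/iter_NNNNNNN/mp_rank_00/model_optim_rng.pt
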
cdj0311 changed discussion status to closed
