NuExtract-large 7B and NuExtract 3.8B have the same model file size

#4
by mohit0928 - opened

The size of model.safetensors is the same for both NuExtract-large 7b and NuExtract 3.8B. Since the NuExtract-large has nearly twice the number of parameters, shouldn't the file sizes be different? I'm checking this to verify whether these two are the same model or different models.

No, it's not the same model; this one is based on phi3-small. The difference comes from the fact that you need to save/load the weights in bf16 (mostly because we couldn't run the training in full precision because of flash-attention).
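To make the arithmetic behind this concrete, here is a rough sketch of why the two files end up about the same size. It assumes the nominal parameter counts from the thread title (3.8B and 7B), that the 3.8B checkpoint is stored in float32 (implied by the thread, not stated outright), and it ignores the small safetensors header overhead. The helper name `checkpoint_size_gb` is made up for illustration.

```python
# Approximate checkpoint size = number of parameters * bytes per parameter.
# Parameter counts are the nominal figures from the thread title (assumption).

BYTES_FP32 = 4  # float32 uses 4 bytes per weight
BYTES_BF16 = 2  # bfloat16 uses 2 bytes per weight


def checkpoint_size_gb(num_params: float, bytes_per_param: int) -> float:
    """Return the approximate weight file size in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumed: the 3.8B model stored in float32, the 7B model stored in bfloat16.
small_fp32 = checkpoint_size_gb(3.8e9, BYTES_FP32)
large_bf16 = checkpoint_size_gb(7e9, BYTES_BF16)

print(f"NuExtract 3.8B in float32:     ~{small_fp32:.1f} GB")
print(f"NuExtract-large 7B in bfloat16: ~{large_bf16:.1f} GB")
```

Under these assumptions the two sizes come out within about 10% of each other, so a near-identical file size does not mean the weights are the same.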

Alexandre-Numind changed discussion status to closed

Oh, I see. You have also updated the model loading script:
model = AutoModelForCausalLM.from_pretrained("numind/NuExtract", trust_remote_code=True, torch_dtype=torch.bfloat16)
When I checked 10 hours ago, it was:
model = AutoModelForCausalLM.from_pretrained("numind/NuExtract", trust_remote_code=True)
so I was a bit confused that the file size is the same for both NuExtract-large 7B and NuExtract 3.8B.

Thanks for clarifying @Alexandre-Numind
