Thank you for the good model.

#1
by helpyouifican - opened

Hi, I'm using the model you made, and I'm finding it works well. But I have some questions.

  1. I'm trying to reproduce the distillation process. How did you extract the jina v3 safetensor that only includes the text-matching LoRA?

  2. Regarding semantic search (like STS), were the results of this distilled model good enough, either empirically or based on benchmarks? (Even when using the m2v version of the LoRA-applied model directly?)

  3. Is ONNX conversion possible? Or does it function well enough on the CPU even without it?

  4. I'm trying to reproduce this distillation. Is it okay to configure the settings like this and distill?


```python
from model2vec.distill import distill

m2v_model = distill(
    model_name="Lora applied jina v3",  # placeholder for the LoRA-merged model
    pca_dims=256,
    apply_zipf=True,
    trust_remote_code=True,
)
```

  5. I don't know how to extract the LoRA, so I tried distilling the entire model. Even after repeating the distillation several times, I get different cosine similarities each time. Is this normal? Or is it influenced by a random seed?

Hi, I'm using the model you made, and I'm finding it works well. But I have some questions.

Great to hear. :)

  1. I'm trying to reproduce the distillation process. How did you extract the jina v3 safetensor that only includes the text-matching LoRA?

I patched in the appropriate adapter_mask from lora_adaptations for each extraction. I don't have the exact code anymore, but it should be simple enough to reproduce.
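If it helps, here's a rough, untested sketch of that idea: wrap the model's forward so the text-matching adapter is selected on every pass. The `adapter_mask` kwarg and the `lora_adaptations` config field come from jina-embeddings-v3's custom modeling code, so treat them as assumptions and verify against the actual implementation:

```python
import torch
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("jinaai/jina-embeddings-v3", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("jinaai/jina-embeddings-v3")

# lora_adaptations lists the task names; grab the index of the one we want.
task_id = model.config.lora_adaptations.index("text-matching")

original_forward = model.forward

def forward_with_text_matching(*args, **kwargs):
    input_ids = kwargs.get("input_ids", args[0] if args else None)
    # One task id per batch element selects the text-matching LoRA.
    kwargs["adapter_mask"] = torch.full(
        (input_ids.shape[0],), task_id, dtype=torch.int32, device=input_ids.device
    )
    return original_forward(*args, **kwargs)

model.forward = forward_with_text_matching
```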

  2. Regarding semantic search (like STS), were the results of this distilled model good enough, either empirically or based on benchmarks? (Even when using the m2v version of the LoRA-applied model directly?)

I didn't do any benchmarking, but I have used these quite a bit in internal projects with great success.
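If you want a quick empirical check yourself, something like the sketch below works; the model path is a placeholder for whichever distilled model you're testing:

```python
import numpy as np
from model2vec import StaticModel

# Placeholder path -- point this at the distilled model you want to check.
model = StaticModel.from_pretrained("path/to/distilled-jina-v3-m2v")

pairs = [
    ("A man is playing a guitar.", "Someone plays the guitar."),     # similar
    ("A man is playing a guitar.", "The stock market fell today."),  # unrelated
]
for a, b in pairs:
    va, vb = model.encode([a, b])
    cos = float(np.dot(va, vb) / (np.linalg.norm(va) * np.linalg.norm(vb)))
    print(f"{cos:.3f}  {a!r} vs {b!r}")
```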

  3. Is ONNX conversion possible? Or does it function well enough on the CPU even without it?

I don't know if ONNX is possible, but it's blazing fast on CPU, so probably not an issue.
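For context on why CPU speed isn't a concern: a model2vec model is just a static embedding table, so inference is roughly a lookup plus a mean. A simplified sketch of what encoding amounts to (the real StaticModel also handles tokenization and normalization):

```python
import numpy as np

# Simplified: a model2vec forward pass is an embedding-table lookup
# followed by mean pooling -- no transformer layers, hence the CPU speed.
def embed(token_ids: list[int], table: np.ndarray) -> np.ndarray:
    return table[token_ids].mean(axis=0)
```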

  4. I'm trying to reproduce this distillation. Is it okay to configure the settings like this and distill?

Apart from the patching, I just used the default distillation values, so you should be good.
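Concretely, with the in-memory patched model from the earlier sketch, you can hand it straight to model2vec via distill_from_model instead of a hub name; this assumes the `model` and `tokenizer` from that snippet, with everything else left at defaults (the output path is hypothetical):

```python
from model2vec.distill import distill_from_model

# Distill the patched in-memory model rather than a hub name; defaults otherwise.
m2v_model = distill_from_model(model=model, tokenizer=tokenizer, pca_dims=256)
m2v_model.save_pretrained("jina-v3-text-matching-m2v")  # hypothetical output path
```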

  5. I don't know how to extract the LoRA, so I tried distilling the entire model. Even after repeating the distillation several times, I get different cosine similarities each time. Is this normal? Or is it influenced by a random seed?

I noticed this as well; I'm unsure what the cause is, and I'm guessing it's also why query/passage pairs don't work on distilled models. :( I suggest you ask the model2vec authors.
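If you want to quantify the drift before filing an issue, distilling twice with identical settings and comparing one sentence's embedding across runs is a cheap check (this distills the full model, as in your snippet above):

```python
import numpy as np
from model2vec.distill import distill

# Distill twice with identical settings and compare one sentence's embedding.
runs = [
    distill(model_name="jinaai/jina-embeddings-v3", pca_dims=256, trust_remote_code=True)
    for _ in range(2)
]
a, b = (m.encode(["the cat sat on the mat"])[0] for m in runs)
cos = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"cross-run cosine similarity: {cos:.4f}")  # < 1.0 indicates nondeterminism
```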

Thank you very much!

helpyouifican changed discussion status to closed
