What are DS, CL, S-DS and S-CL?

#10
by supercharge19 - opened

What are these variants, and how do they affect the models? Which one suits function calling best, and with what prompt?

Referring to their paper, I think DS means the model is fine-tuned from the DeepSeek base model, and CL means it is fine-tuned from the CodeLlama base model.

Yeah, CL and DS are quite clear. But what does "S-" stand for? ;)

If I understand this tweet (xeet?) correctly, it's a model additionally trained on the Magicoder-Evol-Instruct-110K data set?

So:

  • Magicoder-CL-7B = CodeLlama 7B fine-tuned on the Magicoder-OSS-Instruct-75K data set
  • Magicoder-S-CL-7B = CodeLlama 7B fine-tuned on the ise-uiuc/Magicoder-OSS-Instruct-75K and Magicoder-Evol-Instruct-110K data sets?
  • Magicoder-DS-6.7B = DeepSeek-Coder 6.7B fine-tuned on the Magicoder-OSS-Instruct-75K data set
  • Magicoder-S-DS-6.7B = DeepSeek-Coder 6.7B fine-tuned on the ise-uiuc/Magicoder-OSS-Instruct-75K and Magicoder-Evol-Instruct-110K data sets?

Is that correct? I may add this to the docs if you want. (A minimal loading sketch for these variants follows below.)
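If that mapping is right, all four variants should load the same way. Here's a minimal sketch with the transformers pipeline; the prompt template is the one from the Magicoder README as I remember it, so double-check it there, and the instruction string is just a made-up example:

```python
import torch
from transformers import pipeline

# Any of the four variants should work here; using the S-DS one as an example.
MODEL_ID = "ise-uiuc/Magicoder-S-DS-6.7B"

# Prompt template as given in the Magicoder README (assumed to apply to all variants).
PROMPT = """You are an exceptionally intelligent coding assistant that consistently delivers accurate and reliable responses to user instructions.

@@ Instruction
{instruction}

@@ Response
"""

generator = pipeline(
    "text-generation",
    model=MODEL_ID,
    torch_dtype=torch.bfloat16,  # halves weight memory vs. fp32
    device_map="auto",           # place layers on available GPUs / offload to CPU
)

result = generator(
    PROMPT.format(instruction="Write a Python function that parses a CSV line."),
    max_new_tokens=256,
    do_sample=False,  # greedy decoding, deterministic output
)
print(result[0]["generated_text"])
```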

Can you also confirm the minimum VRAM required to run each specific model?

I tried to load the Gradio demo but got `CUDA out of memory. Tried to allocate 32.00 MiB.`, so it requires much more than I was expecting :D
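For a rough rule of thumb (my own back-of-envelope, not from the README): the weights alone need roughly parameter-count × bytes-per-parameter, plus headroom for activations and the KV cache. A quick sketch:

```python
def vram_estimate_gb(params_billion: float, bytes_per_param: float,
                     overhead: float = 1.2) -> float:
    """Weights-only estimate with ~20% headroom for activations/KV cache."""
    return params_billion * bytes_per_param * overhead

# A ~7B model at various precisions (very rough; real usage varies with
# context length, batch size, and framework overhead):
for dtype, nbytes in [("fp32", 4.0), ("fp16/bf16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    print(f"{dtype:9s} ~{vram_estimate_gb(7, nbytes):5.1f} GB")
```

By that estimate a 7B model wants roughly 17 GB in fp16 and 34 GB in fp32, which would explain the OOM on a smaller card.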

Sorry, my bad... It's already in the README... :P
https://github.com/ise-uiuc/magicoder/blob/main/README.md#-dataset

  • Magicoder-OSS-Instruct-75K: generated through OSS-Instruct using gpt-3.5-turbo-1106 and used to train both the Magicoder and Magicoder-S series.
  • Magicoder-Evol-Instruct-110K: decontaminated and redistributed from theblackcat102/evol-codealpaca-v1, used to further fine-tune the Magicoder series and obtain the Magicoder-S models.

But I would add it to the Models table for total clarity (base model, data sets, and minimum VRAM required).
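Until that lands in the table, one way to squeeze a ~7B variant onto a smaller card is 4-bit quantization. A sketch, assuming bitsandbytes is installed; the model ID is just one of the four variants above:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "ise-uiuc/Magicoder-S-DS-6.7B"  # any of the four variants

# 4-bit NF4 quantization: weights drop to ~0.5 bytes/param,
# so a ~7B model fits in roughly 4-5 GB of VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
)
```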
