I'm releasing a very capable Gemma 7b model; it might be worth your time to fine-tune it instead of the base version.

by rombodawg

@aaabiao I just finished creating a model I call EveryoneLLM-7b-Gemma-Base, and I will be uploading the weights overnight. It's an extremely capable model that combines aspects of many existing models and does well across a variety of fields, including coding. I think this model would be perfect for you to fine-tune instead of the base gemma-7b model. I just wanted to give you a heads-up in case you are interested.

Here is where the model files will be once they're done uploading:

https://huggingface.co/rombodawg/EveryoneLLM-7b-Gemma-Base

Multimodal Art Projection org

Hi @rombodawg ,

Thank you for reaching out and sharing your work on the EveryoneLLM-7b-Gemma-Base model. It's great to hear about its capabilities and the diverse areas it excels in, including coding. We appreciate your suggestion to fine-tune it in place of the base Gemma-7b model.

We have indeed conducted some private experiments with SFT on the Gemma-7b base, but unfortunately the benchmark scores did not meet our expectations. Specifically, our results were 0.622 on HumanEval and 0.446 on MBPP, leading us not to release a final version publicly at this time.
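
For reference, those scores are the usual execution-based pass@1 estimates. A minimal sketch of that kind of scoring, using the `evaluate` library's `code_eval` metric (this is a generic illustration with a toy problem, not our exact evaluation harness or actual benchmark items):

```python
# Sketch of execution-based pass@1 scoring, HumanEval/MBPP-style.
# Executing model-generated code requires opting in explicitly.
import os
os.environ["HF_ALLOW_CODE_EVAL"] = "1"

import evaluate

code_eval = evaluate.load("code_eval")

# One test harness per problem, and a list of candidate completions per problem.
test_cases = ["assert add(2, 3) == 5"]
candidates = [["def add(a, b):\n    return a + b",
               "def add(a, b):\n    return a - b"]]

# pass@1 is estimated from the fraction of candidates that pass the tests.
pass_at_k, results = code_eval.compute(
    references=test_cases,
    predictions=candidates,
    k=[1],
)
print(pass_at_k)  # e.g. {'pass@1': 0.5}
```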

Additionally, our internal resources have been allocated to other projects of higher priority, which means that the OpenCodeInterpreter-GM project has been deprioritized. As a result, we will have to delay any retraining efforts, and we apologize for any inconvenience this may cause.

Nevertheless, we are thankful for your initiative and would be interested in knowing the benchmark scores for your model. Should our resources allow in the future, we would consider using your uploaded model as a base for further enhancing coding capabilities through specialized fine-tuning.

Thank you again for your understanding and for considering us in your work. We look forward to the possibility of collaborating in the future.

You have to check this out, @aaabiao. Massive improvements for Gemma fine-tuning because of these findings:

https://www.reddit.com/r/LocalLLaMA/comments/1bd18y8/gemma_finetuning_should_be_much_better_now/

I've opened an official issue for transformers to implement a fix:
https://github.com/huggingface/transformers/issues/29616
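
In short, the findings come down to subtle loading details. Here's a rough sketch of what checking two of the reported pitfalls might look like (the config field name and the fixes themselves are assumptions based on the linked posts, not a confirmed patch, and may differ across transformers versions):

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-7b"

# 1) Gemma was reportedly trained with the approximate (tanh) GELU,
#    not exact GELU; field name is an assumption based on the issue.
config = AutoConfig.from_pretrained(model_id)
config.hidden_act = "gelu_pytorch_tanh"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    torch_dtype=torch.bfloat16,  # the thread reports precision-sensitive layers
)

# 2) Gemma reportedly expects a leading <bos> token; verify the
#    tokenizer adds it before building fine-tuning data.
tokenizer = AutoTokenizer.from_pretrained(model_id)
ids = tokenizer("Hello", add_special_tokens=True).input_ids
assert ids[0] == tokenizer.bos_token_id, "training data must start with <bos>"
```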

Multimodal Art Projection org

Thanks for sharing this! I'll definitely check it out.

@aaabiao I would like to share that we at Replete-AI have created a new model called Mistral-11b-v0.1, which expands the size of the mistral-7b model and continues its pretraining. Feel free to check it out. I would love to see a coding variant if your team is at all interested.

https://huggingface.co/Replete-AI/Mistral-11b-v0.1
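
If anyone is curious where the extra parameters come from, a quick config comparison should show the structural difference (assuming both repos ship standard transformers configs; size expansions of this kind typically add layers rather than width):

```python
from transformers import AutoConfig

base = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")
expanded = AutoConfig.from_pretrained("Replete-AI/Mistral-11b-v0.1")

# Compare depth (and width, for completeness) of the two models.
print("base layers:    ", base.num_hidden_layers)
print("expanded layers:", expanded.num_hidden_layers)
print("hidden size:    ", base.hidden_size, "vs", expanded.hidden_size)
```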
