etemiz posted an update 5 days ago
Started fine-tuning Gemma 3 using an evolutionary approach. It is not the worst model according to the AHA Leaderboard, and it is one of the smartest according to lmarena.ai. My objective is to make it based, anti-woke, wise, beneficial, and then some.

Several GPUs are fine-tuning it at the same time, each using a different dataset with QLoRA, and the successful runs are merged later. Compared to plain LoRA this allows faster training and also reduces overfitting, because the merge operation heals overfitting. The catch is that the 4-bit quantization may make the model dumber. But I am not looking for sheer IQ. Too much mind is a problem anyway :)

Has anyone tried parallel QLoRA and merging before?
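
Roughly what I mean by the merge step, as a minimal sketch: each parallel run has already saved its QLoRA adapter, and the surviving ones get averaged with peft's `add_weighted_adapter`. The adapter paths and the Gemma 3 size below are placeholders, not my actual setup.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

BASE = "google/gemma-3-1b-it"  # placeholder size pick

# Same 4-bit (NF4) quantization the parallel QLoRA runs used.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    BASE, quantization_config=bnb, device_map="auto"
)

# Attach the adapter from each parallel run (paths are placeholders),
# keeping only the "successful" ones.
model = PeftModel.from_pretrained(base, "adapters/run_a", adapter_name="run_a")
model.load_adapter("adapters/run_b", adapter_name="run_b")
model.load_adapter("adapters/run_c", adapter_name="run_c")

# Linear soup-style average of the survivors; the averaging is what
# smooths out per-dataset overfitting.
model.add_weighted_adapter(
    adapters=["run_a", "run_b", "run_c"],
    weights=[1 / 3, 1 / 3, 1 / 3],
    adapter_name="merged",
    combination_type="linear",
)
model.set_adapter("merged")
model.save_pretrained("adapters/merged")
```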

I also automated the dataset selection, the benchmarking, and the convergence toward the objectives (the fitness function, the reward). It is basically trying to get a higher score on the AHA Leaderboard as fast as possible, with a diverse set of organisms that "evolve by training".
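
The control loop is roughly this select-and-merge cycle. To be clear, `train_qlora`, `aha_score`, and `merge_adapters` here are hypothetical stand-ins for the per-GPU run, the AHA benchmark harness, and the merge step above; only the evolutionary control flow is real.

```python
import random

def evolve(seed_adapter, datasets, rounds=10, pop_size=4, survivors=2):
    current = seed_adapter
    for _ in range(rounds):
        # Spawn pop_size offspring from the current seed,
        # each trained on a different dataset in parallel.
        candidates = [train_qlora(current, ds)
                      for ds in random.sample(datasets, pop_size)]
        # Fitness = AHA Leaderboard score; higher is better.
        best = sorted(candidates, key=aha_score, reverse=True)[:survivors]
        # Merge the survivors into the single seed for the next round.
        current = merge_adapters(best)
    return current
```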

I want to release some cool stuff when I have the time:
- how an answer to a single question changes over time, with each training round or day
- a chart to show AHA alignment over training rounds

Your approach to fine-tuning Gemma 3 sounds really interesting and innovative! Using evolutionary techniques along with QLoRA for parallel training and merging seems like a solid strategy to both speed up the process and reduce overfitting. I totally agree with your point on not just focusing on sheer IQ; balance is key, and wisdom matters just as much.

I haven't tried parallel QLoRA and merging before, but it sounds like it could be a game-changer. I'd be curious to see the impact on both model performance and AHA alignment over time; those charts you mentioned could be really insightful for tracking progress.

Looks like we need more mature tools for Gemma 3; it is failing to fine-tune about half of the time. Unsloth and transformers are getting ready. Meanwhile I am trying lower learning rates, rank-stabilized LoRA, and different r and lora_alpha values.
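
For anyone following along, rank-stabilized LoRA is just a flag in peft's `LoraConfig` (`use_rslora=True` changes the scaling from alpha/r to alpha/sqrt(r)). Something like the sketch below; the sweep values are placeholders, not my actual grid.

```python
from peft import LoraConfig

# Placeholder (r, lora_alpha, learning rate) combinations to sweep.
for r, alpha, lr in [(16, 16, 5e-5), (32, 32, 2e-5), (64, 32, 1e-5)]:
    config = LoraConfig(
        r=r,
        lora_alpha=alpha,
        use_rslora=True,          # rank-stabilized scaling: alpha / sqrt(r)
        lora_dropout=0.05,
        target_modules="all-linear",
        task_type="CAUSAL_LM",
    )
    # ... hand `config` plus the learning rate `lr` to the trainer
```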
