🔥 Key Innovations:
1️⃣ First to adapt SD for direct textured mesh generation (1-2s inference)
2️⃣ Novel teacher-student framework leveraging multi-view diffusion models ([MVDream](https://arxiv.org/abs/2308.16512) & [RichDreamer](https://arxiv.org/abs/2311.16918))
3️⃣ Parameter-efficient tuning: only +2.6% params over base SD
4️⃣ 3D data-free training frees the model from dataset constraints
💡 Why it matters:
✅ A novel 3D-data-free paradigm
✅ Outperforms data-driven methods on creative concept generation
✅ Unlocks the web-scale text corpus for 3D content creation
AReal-Boba 🔥 A fully open RL framework released by AntGroup, an affiliate of Alibaba.
inclusionAI/areal-boba-67e9f3fa5aeb74b76dcf5f0a
✨ 7B/32B - Apache 2.0
✨ Outperforms on math reasoning
✨ Replicates QwQ-32B with 200 training samples for under $200
✨ All-in-one: weights, datasets, code & tech report
reacted to MonsterMMORPG's post, 20 days ago
I have compared Kohya vs OneTrainer for FLUX Dev fine-tuning / DreamBooth training.
OneTrainer can train FLUX Dev with text encoders, unlike Kohya, so I wanted to try it.
Unfortunately, the developer doesn't want to add a feature to save the trained CLIP L or T5 XXL as safetensors or merge them into the output, so they are basically useless without a lot of extra effort.
I still went ahead and tested EMA training. EMA normally improves quality significantly in SD 1.5 training. With FLUX I had to use the CPU for EMA, and it was really slow, but I wanted to test it.
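For anyone unfamiliar with EMA training: the trainer keeps a second, exponentially smoothed copy of the weights and updates it after optimizer steps. OneTrainer's actual implementation may differ; this is just a plain-Python sketch of the EMA rule itself (the `ema_update` name and the list-of-floats weights are illustrative):

```python
# Minimal sketch of EMA (exponential moving average) weight tracking.
# Real trainers do this over tensors on GPU/CPU; here weights are plain floats.
def ema_update(ema_weights, model_weights, decay=0.999):
    """Blend the current model weights into the smoothed EMA copy."""
    return [decay * e + (1.0 - decay) * w
            for e, w in zip(ema_weights, model_weights)]

ema = [0.0]
for step in range(3):
    weights = [1.0]  # pretend the trained weight sits at 1.0
    ema = ema_update(ema, weights)
# after 3 updates the EMA copy has drifted only slightly toward 1.0
```

With `decay=0.999`, the EMA copy lags far behind the live weights, which is why it tends to smooth out noisy late-training updates; updating it every N steps instead of every step (as in the grids below) effectively changes that smoothing.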
I tried to replicate the Kohya config. Below you will see the results. Sadly, the quality falls short. More research has to be done, and since we still don't get text-encoder training due to the developer's decision, I don't see any benefit to using OneTrainer for FLUX training instead of Kohya.
2nd image: OneTrainer, Kohya config, EMA update every 1 step
3rd image: OneTrainer, Kohya config, EMA update every 5 steps
4th image: OneTrainer, Kohya config
5th image: OneTrainer, Kohya config, but Timestep Shift is 1 instead of 3.1582
I am guessing that OneTrainer's Timestep Shift is not the same as Kohya's Discrete Flow Shift.
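For context on why that parameter matters: flow-matching trainers for SD3/FLUX-style models commonly warp the sampled timestep with a shift factor. Whether OneTrainer and Kohya both use exactly this formula (or parameterize it the same way) is an assumption; a sketch of the commonly used sigma-shift:

```python
def shift_timestep(t, shift):
    """Commonly used flow-matching timestep shift: warps t in [0, 1]
    toward the high-noise end when shift > 1; identity when shift == 1."""
    return shift * t / (1.0 + (shift - 1.0) * t)

print(shift_timestep(0.5, 1.0))     # identity: 0.5
print(shift_timestep(0.5, 3.1582))  # pushed well above 0.5
```

If one tool applies this warp and the other interprets the value differently, the two trainers would spend their steps on very different noise levels even with "matching" configs, which could explain the quality gap.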
I probably need to do more work and testing to improve the results, but I don't see any reason to at the moment. If CLIP training plus merging it into the safetensors file were working, I would have pursued it.
These are not cherry-picked results; all are from the first test grid.
Last year, I curated & generated a few multilingual SFT and DPO datasets by translating English SFT/DPO datasets into 9-10 languages using the mistralai/Mistral-7B-Instruct-v0.2 model.
I hope it helps the community with pretraining / instruction tuning of multilingual LLMs! I added a small diagram briefly describing which datasets are included and their sources.
Happy to collaborate, either on using these datasets for instruction fine-tuning or on extending translated versions to newer English SFT/DPO datasets!
🚀 DEEPSEEK R1… Replicated! 🧠✨ All powered by just ONE system prompt. Try it. Compare it. See for yourself. 🔥 Even better than the original, with richer, more insightful replies. 💯 No gimmicks. Just pure AI performance.