🔥 Key Innovations:
1️⃣ First to adapt SD for direct textured mesh generation (1-2s inference)
2️⃣ Novel teacher-student framework leveraging multi-view diffusion models ([MVDream](https://arxiv.org/abs/2308.16512) & [RichDreamer](https://arxiv.org/abs/2311.16918))
3️⃣ Parameter-efficient tuning: only +2.6% params over base SD
4️⃣ 3D data-free training frees the model from dataset constraints
💡 Why it matters:
✅ A novel 3D-data-free paradigm
✅ Outperforms data-driven methods on creative concept generation
✅ Unlocks the web-scale text corpus for 3D content creation
AReal-Boba 🔥 A fully open RL framework released by AntGroup, an affiliate of Alibaba.
inclusionAI/areal-boba-67e9f3fa5aeb74b76dcf5f0a
✨ 7B/32B - Apache 2.0
✨ Outperforms on math reasoning
✨ Replicates QwQ-32B with 200 training samples for under $200
✨ All-in-one: weights, datasets, code & tech report
reacted to MonsterMMORPG's post, 20 days ago
I have compared Kohya vs OneTrainer for FLUX Dev fine-tuning / DreamBooth training.
OneTrainer can train FLUX Dev with text encoders, unlike Kohya, so I wanted to try it.
Unfortunately, the developer doesn't want to add a feature to save the trained CLIP L or T5 XXL as safetensors or merge them into the output, so they are basically useless without a lot of extra effort.
I still went ahead and tested EMA training. EMA normally improves quality significantly in SD 1.5 training. With FLUX I had to use the CPU for EMA, and it was really slow, but I wanted to test it.
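For anyone unfamiliar with EMA training: the trainer keeps a second, exponentially smoothed copy of the weights and updates it after optimizer steps. OneTrainer's actual implementation may differ; this is just a plain-Python sketch of the EMA rule itself (the `ema_update` name and the list-of-floats weights are illustrative):

```python
# Minimal sketch of EMA (exponential moving average) weight tracking.
# Real trainers do this over tensors on GPU/CPU; here weights are plain floats.
def ema_update(ema_weights, model_weights, decay=0.999):
    """Blend the current model weights into the smoothed EMA copy."""
    return [decay * e + (1.0 - decay) * w
            for e, w in zip(ema_weights, model_weights)]

ema = [0.0]
for step in range(3):
    weights = [1.0]  # pretend the trained weight sits at 1.0
    ema = ema_update(ema, weights)
# after 3 updates the EMA copy has drifted only slightly toward 1.0
```

With `decay=0.999`, the EMA copy lags far behind the live weights, which is why it tends to smooth out noisy late-training updates; updating it every N steps instead of every step (as in the grids below) effectively changes that smoothing.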
I tried to replicate the Kohya config. Below you will see the results. Sadly, the quality falls short. More research has to be done, and since we still don't get text-encoder training due to the developer's decision, I don't see any benefit to using OneTrainer for FLUX training instead of Kohya.
2nd image: OneTrainer, Kohya config, EMA update every 1 step
3rd image: OneTrainer, Kohya config, EMA update every 5 steps
4th image: OneTrainer, Kohya config
5th image: OneTrainer, Kohya config, but Timestep Shift is 1 instead of 3.1582
I am guessing that OneTrainer's Timestep Shift is not the same as Kohya's Discrete Flow Shift.
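For context on why that parameter matters: flow-matching trainers for SD3/FLUX-style models commonly warp the sampled timestep with a shift factor. Whether OneTrainer and Kohya both use exactly this formula (or parameterize it the same way) is an assumption; a sketch of the commonly used sigma-shift:

```python
def shift_timestep(t, shift):
    """Commonly used flow-matching timestep shift: warps t in [0, 1]
    toward the high-noise end when shift > 1; identity when shift == 1."""
    return shift * t / (1.0 + (shift - 1.0) * t)

print(shift_timestep(0.5, 1.0))     # identity: 0.5
print(shift_timestep(0.5, 3.1582))  # pushed well above 0.5
```

If one tool applies this warp and the other interprets the value differently, the two trainers would spend their steps on very different noise levels even with "matching" configs, which could explain the quality gap.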
I probably need to do more work and testing to improve the results, but I don't see any reason to at the moment. If CLIP training plus merging it into the safetensors file were working, I would have pursued it.
These are not cherry-picked results; all are from the first test grid.
Last year, I curated & generated a few multilingual SFT and DPO datasets by translating English SFT/DPO datasets into 9-10 languages using the mistralai/Mistral-7B-Instruct-v0.2 model.
I hope it helps the community with pretraining / instruction tuning of multilingual LLMs! I added a small diagram briefly describing which datasets are included and their sources.
Happy to collaborate, either on using these datasets for instruction fine-tuning or on extending translated versions to newer English SFT/DPO datasets!
🚀 DEEPSEEK R1… Replicated! 🧠✨ All powered by just ONE system prompt. Try it. Compare it. See for yourself. 🔥 Even better than the original, with richer, more insightful replies. 💯 No gimmicks. Just pure AI performance.