Questions from a noob

#1
by lordfervi - opened

I stumbled upon your profile by accident.

I'm wondering how you add so many parameters to these models—is it a lot of fine-tuning?

My DGX Spark is supposed to arrive soon, and I'm really looking forward to seeing if the model is significantly better.

@lordfervi I've been at this for 2 years now. It's pretty complicated, and I'm still working on perfecting it. Do you want me to make you one? Feedback is super welcome. Enjoy.

@LLMWildling
Let's just say this, we'll stay in touch :)

People usually do various types of model optimization. Your models look like they do some fine-tuning.

I'm afraid we'll reach a point where you'll have enterprise models in the cloud (OpenAI, Claude), some large open-source models (like DeepSeek) that are too expensive for consumers to run, and small models (like Gemma), but very few mid-tier models (like Mistral Small 4, GPT-OSS, etc.).

If it's possible to "easily" improve AI models, I think it's revolutionary.

I accidentally found your profile and see that you've made a lot of models much larger. I'm wondering if it works correctly and so on.

Unfortunately, I don't have the DGX Spark for now (I hope I'll have it next week). I'll let you know when I can test it.

Maybe in the future you'll manage to tune Mistral Small 4 a little (even if not significantly) ;)

I just checked my pipeline, and I'm pretty sure it's pre/post training, the full run. I use my own optimizer to speed things up.

"If it's possible to "easily" improve AI models, I think it's revolutionary." - That's the idea, but things take time to perfect, so any feedback from this community is welcome. I have a lot more coming.

Mistral Small 4 - do you want a bigger version? @lordfervi
