question / hint

#1
by kalle07 - opened

ok... my test was too simple, it only gave minor differences, but ...

You put so much effort into merging and combining, but do you have any idea why, what could be different, or where the advantage is?

  • language: Chinese/English?
  • more chat or more instruct?
  • temperature, top_p, top_k?
  • LLM test / benchmark results?
    I mean, there are already at least 20 very good models, so why should someone try yours? ^^
    Should the "community" find that out?

I am open to any suggestions:

  • if you have candidate models you would like to merge
  • different merging techniques

The bottom line is, merging is very new! I just started evaluating these merges one by one via llm-harness, and they perform very differently from the original models. The reasons why I uploaded them (mostly personal):

  • Storage management! I need to keep cleaning up my disks, but this way I am not losing the models, and if I need them I can re-download them
  • Easy to share. I use one machine to merge and another to evaluate, so I just provide the name of the model instead of constantly moving files
  • And lastly, if it helps the community to try them and come to any conclusion, why not. These models do not say "use me, I am better!". They are here for my research purposes, and most of them might end up with the same accuracy and global capabilities, or even worse in some areas. But the ones that perform better/differently are why I do this.

I am planning to move on to merges of different languages, different context lengths, and different tasks/instructions soon (and later on maybe MoE). If you have a list of candidates, merging techniques, or anything else you think would be interesting, please do let me know. I am very open to giving them a shot. I use this library: https://github.com/cg123/mergekit
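Roughly, a merge-then-evaluate run looks something like the sketch below. The model IDs are placeholders (substitute whatever checkpoints you want to combine), the SLERP settings are just an illustrative default, and the lm-evaluation-harness call assumes its newer `lm_eval` CLI, so adjust to whatever version you run:

```python
# Minimal sketch of a merge-then-evaluate run (an assumed workflow, not a fixed recipe).
# The two model IDs below are placeholders -- substitute the checkpoints you want to merge.
import subprocess
from pathlib import Path

# A SLERP merge config in mergekit's YAML format: two same-architecture
# fine-tunes, interpolated layer by layer, loaded and saved in float16.
CONFIG = """\
slices:
  - sources:
      - model: org-a/hermes-style-model   # placeholder
        layer_range: [0, 32]
      - model: org-b/orca-style-model     # placeholder
        layer_range: [0, 32]
merge_method: slerp
base_model: org-a/hermes-style-model
parameters:
  t: 0.5        # 0 = only the base model, 1 = only the other model
dtype: float16
"""

Path("merge-config.yml").write_text(CONFIG)
out_dir = "merged-model"

# mergekit's CLI entry point: config file in, merged model directory out.
subprocess.run(["mergekit-yaml", "merge-config.yml", out_dir], check=True)

# Evaluate the merged model, e.g. with lm-evaluation-harness (v0.4+ `lm_eval` CLI);
# the task list here is arbitrary.
subprocess.run([
    "lm_eval",
    "--model", "hf",
    "--model_args", f"pretrained={out_dir},dtype=float16",
    "--tasks", "hellaswag,arc_challenge",
    "--batch_size", "8",
], check=True)
```

The `t` parameter just controls how far the interpolation leans toward the second model; 0.5 is an arbitrary midpoint, not a recommendation.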

I see ...

I am not a merge expert ;)
But I took a quick look at this mergekit, and I don't fully understand which kind of models are the right ones.
So you need the uncompiled ones, i.e. no GGUF or GPTQ?
Can you give me a hint?

btw, if these files are larger than 20 GB, my internet is too slow ;)

These are some models:
Marcoroni, Hermes, Redmond-Puffin, OpenHermes, SlimOrca, Orca, supermario (this one also seems to be private)

For my understanding, it would be useful to have both more chat (human) and more instruct (knowledge) models, so you have the choice of which to work with.

At the moment I'm more interested in testing models to see how they work with documents. Unfortunately there are currently only 2-3 programs for this (all more or less bad):
gpt4all, privateGPT, and superbooga for text-generation-webui

Do you have any experience with LoRA training in text-generation-webui? Is it as simple as providing plain text so it learns the content of some books?

THX, cu

Of course!

  • I don't believe anyone has merged quantized models and gotten anything interesting. It's always better to work with 16/32-bit weights and then quantize the result, so I go with the default that loads the models in 16-bit
  • Marcoroni, Hermes, Redmond-Puffin, OpenHermes, SlimOrca, Orca, supermario: I actually really like some of these! So I will attack the original models first and merge them with different fine-tuned models from the same architecture.
  • I am going to find out whether we have models based on, let's say, Hermes that are purely chat (human) and purely instruct (knowledge), merge them with different techniques, evaluate, and see the results. Different languages and different fine-tuning techniques are interesting for merges
  • I have done fine-tuning, but it was mostly prompt-based via instructions. To learn from plain text, RAG is usually the better option: pair some retrieved chunks with a prompt/instructions to mimic the tone/style of writing, etc. (sketched below)
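To illustrate that last point, here is a minimal RAG sketch. It assumes sentence-transformers for the embeddings; the embedding model, the chunking scheme, and the file name "book.txt" are just simple placeholder defaults, not recommendations. The key idea is that the retrieved chunks get pasted into the prompt at query time instead of being trained into the weights:

```python
# Minimal RAG sketch: embed chunks of a plain-text book, retrieve the most
# relevant ones for a question, and pair them with an instruction prompt.
# Requires `sentence-transformers`; the embedding model is just a small,
# commonly used default, and "book.txt" is a placeholder file name.
from sentence_transformers import SentenceTransformer, util

book_text = open("book.txt", encoding="utf-8").read()

# Naive fixed-size chunking; real pipelines usually split on paragraphs or sections.
chunk_size = 1000
chunks = [book_text[i:i + chunk_size] for i in range(0, len(book_text), chunk_size)]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
chunk_emb = embedder.encode(chunks, convert_to_tensor=True)

question = "What does the author say about X?"
q_emb = embedder.encode(question, convert_to_tensor=True)

# Cosine similarity between the question and every chunk; keep the top 3 chunks.
scores = util.cos_sim(q_emb, chunk_emb)[0]
top_chunks = [chunks[i] for i in scores.topk(3).indices.tolist()]

# The retrieved chunks are pasted into the prompt rather than trained into the model.
prompt = (
    "Answer the question using only the context below.\n\n"
    "Context:\n" + "\n---\n".join(top_chunks) + "\n\n"
    f"Question: {question}\nAnswer:"
)
print(prompt)  # feed this to whatever local LLM you are running
```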

hey... omg, I could not write for 4 days because there was a fault (in Hugging Face) with newcomers . . .

Okay, it looks like you have a plan ;)
What RAG software do you use?
All I've tried is GPT4All and privateGPT; the superbooga extension in oobabooga doesn't work.
Do you have any suggestions? I don't want to use any access tokens or keys, and it would be nice if it had a small GUI.

btw, all these base models have at least "wiki"-level knowledge, so why is it so complicated to add some PDF books to any model?

Another question: do you have any idea which embedding models are used for RAG, and how a PyTorch model gets quantized to GGUF?
