works but still not ready

#1
by mahen23 - opened

These are not comprehensive answers


Hi, @mahen23 ! Thank you for trying these models out!
Unfortunately, these tiny models aren't really useful without RAG. They know English and how to read and elaborate on sentences, but due to their size, their knowledge is limited and convoluted.
They were created mainly for understanding a provided context and giving you an answer/summary based on that context, as in MiniSearch [Source Code], where we provide them with snippets from the Internet and let them compose an answer derived from the results.
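To make the "provided context" idea concrete, here is a minimal sketch of how snippets can be stuffed into a prompt before generation. The template wording and the `build_rag_prompt` helper are illustrative assumptions, not MiniSearch's actual prompt format:

```python
def build_rag_prompt(question, snippets):
    """Assemble a grounded prompt: number the retrieved snippets and
    instruct the model to answer only from them."""
    context = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(snippets))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Example: two hypothetical search-result snippets
prompt = build_rag_prompt(
    "What does MiniSearch do?",
    ["MiniSearch is a privacy-focused, in-browser search app.",
     "It runs small language models client-side to summarize results."],
)
print(prompt)
```

A tiny model given a prompt like this only has to read and summarize, which is exactly the task these models were trained for.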
It's also important to note that the smaller the model, the more objective your system/user prompt has to be, to reduce the chance of hallucination. The decoding strategy you choose also matters: for single questions you'll want Contrastive Search, while for long chats you'll prefer Multinomial Sampling.
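If it helps to see the difference between the two strategies, here is a toy, dependency-free sketch of the selection rules (not the real implementation: in practice the similarity term comes from the model's hidden states, and the candidate similarity values below are stand-ins):

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def multinomial_sample(logits, rng):
    """Multinomial sampling: draw the next token at random from the full
    probability distribution -- varied output, good for long chats."""
    probs = softmax(logits)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

def contrastive_pick(logits, candidate_sims, alpha=0.6, top_k=4):
    """Contrastive search: among the top-k likely tokens, pick the one that
    balances probability against similarity to the context already generated,
    penalizing repetition -- focused output, good for single questions.
    candidate_sims[i] stands in for the max similarity of token i's
    representation to the previously generated tokens."""
    probs = softmax(logits)
    top = sorted(range(len(logits)), key=lambda i: probs[i], reverse=True)[:top_k]
    return max(top, key=lambda i: (1 - alpha) * probs[i] - alpha * candidate_sims[i])

# Token 0 is most probable but nearly identical to earlier output,
# so contrastive search skips it in favor of token 1.
logits = [3.0, 2.5, 1.0, 0.5]
sims = [0.95, 0.1, 0.2, 0.3]
chosen = contrastive_pick(logits, sims)
sampled = multinomial_sample(logits, random.Random(0))
```

With the `transformers` library, the same choice maps to `model.generate(..., penalty_alpha=0.6, top_k=4)` for Contrastive Search versus `model.generate(..., do_sample=True)` for Multinomial Sampling.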
For better results, I recommend taking TinyLlama-1.1B or Qwen1.5-0.5B as your baseline, as they have been trained on a much larger dataset for much longer.
I hope this info helps you fit these models into your project.
