Релиз вихря 0.5

Долили сильно больше данных в sft, теперь стабильнее работает json и multiturn, слегка подточили параметры претрена модели

Added a lot more data to sft, now json and multiturn work more stable on long context and hard prompts

Google Colab
GGUF


@article{nikolich2024vikhr,
  title={Vikhr: The Family of Open-Source Instruction-Tuned Large Language Models for Russian},
  author={Aleksandr Nikolich and Konstantin Korolev and Artem Shelmanov},
  journal={arXiv preprint arXiv:2405.13929},
  year={2024},
  url={https://arxiv.org/pdf/2405.13929}
}