Unsupervised training of Mistral for Domain-Specific Inference

#135
by H2dddhxh - opened

I'd like to adapt Mistral to a specific domain with unsupervised (self-supervised) training first, using my own raw text corpus as both input and output. The goal is to prepare the model for more targeted, supervised fine-tuning on specific tasks within my field afterwards. The problem I'm running into is that the Mistral fine-tuning examples I've found expect data in an 'instruction, input, output' format. Is it possible to start with unsupervised training on plain text in this setting, and if so, how should I proceed? Will this unsupervised phase lead to better domain-specific performance in the subsequent instruction fine-tuning than starting from Mistral without it?
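For what it's worth, "corpus as both input and output" is usually read as the plain causal language modeling objective: each training example is a block of token ids, and the labels are a copy of those same ids (Hugging Face causal LM models shift the labels by one position internally). Below is a minimal sketch of that data preparation, not a Mistral-specific recipe. The whitespace tokenizer, the `<eos>` token, and the helper names `build_vocab` and `pack_for_causal_lm` are all hypothetical stand-ins for illustration; in practice the model's own tokenizer would be used.

```python
from typing import Dict, List


def build_vocab(texts: List[str]) -> Dict[str, int]:
    """Toy whitespace vocabulary (hypothetical stand-in for a real tokenizer)."""
    vocab = {"<eos>": 0}
    for text in texts:
        for tok in text.split():
            vocab.setdefault(tok, len(vocab))
    return vocab


def pack_for_causal_lm(
    texts: List[str], vocab: Dict[str, int], block_size: int
) -> List[Dict[str, List[int]]]:
    """Concatenate documents into one token stream and cut fixed-size blocks.

    Each block becomes one example whose labels equal its input ids,
    so no instruction/input/output fields are needed.
    """
    stream: List[int] = []
    for text in texts:
        stream.extend(vocab[tok] for tok in text.split())
        stream.append(vocab["<eos>"])  # mark the document boundary

    # Drop the trailing remainder that does not fill a whole block.
    usable = (len(stream) // block_size) * block_size
    examples = []
    for start in range(0, usable, block_size):
        block = stream[start:start + block_size]
        examples.append({"input_ids": block, "labels": list(block)})
    return examples


# Made-up domain corpus, for illustration only.
corpus = [
    "the valve controls coolant flow",
    "coolant flow is measured in litres per minute",
]
vocab = build_vocab(corpus)
examples = pack_for_causal_lm(corpus, vocab, block_size=4)
```

Dictionaries of this shape (`input_ids` plus identical `labels`) are what a causal LM trainer typically consumes; the instruction-style format only becomes relevant in the later supervised phase.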

Hi, can you please explain how you formatted your dataset, or link to examples for the base model? I’ve only seen examples for Instruct.
