Train Mistral 7B 0.2

#2
by mosama - opened

Why don't you guys train Mistral 7B v0.2, which has a 32k context length, on long-context as well as short-context data? Long-context datasets such as:

  1. wckwan/M4LE
  2. THUDM/LongBench
  3. togethercomputer/Long-Data-Collections

or maybe your own curated long-context datasets.
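
For what it's worth, those collections are on the Hub and can be pulled with the datasets library. A minimal sketch follows; the LongBench config name and the Long-Data-Collections file path are assumptions used as examples, so check each dataset card for the exact names:

```python
# Minimal sketch: loading the suggested long-context datasets with the
# Hugging Face `datasets` library. The config name and data file below are
# illustrative assumptions; see each dataset card for what actually exists.
from datasets import load_dataset

# LongBench exposes one config per task; "narrativeqa" is used here as an example.
# trust_remote_code may be required because the dataset ships a loading script.
longbench = load_dataset("THUDM/LongBench", "narrativeqa", split="test", trust_remote_code=True)

# Long-Data-Collections is a set of jsonl files; this path is illustrative.
long_collections = load_dataset(
    "togethercomputer/Long-Data-Collections",
    data_files="fine-tune/booksum.jsonl.zst",
    split="train",
)

print(len(longbench), len(long_collections))
```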


Yeah, I agree. I was considering using this model in a Mixtral merge because of its scores, but that would be difficult given the context constraint of only 8k: it would limit every other Mistral model in the merge to 8k, despite their being able to produce 32k tokens of content.
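
If it helps, the mismatch is visible straight from the model configs. A quick sketch for comparing merge candidates (the model IDs are just examples, swap in whichever checkpoints you plan to merge):

```python
# Quick check of the advertised context length of merge candidates.
# The model IDs are examples; replace them with the checkpoints you intend to merge.
from transformers import AutoConfig

for model_id in ("openchat/openchat-3.5-0106", "mistralai/Mistral-7B-Instruct-v0.2"):
    cfg = AutoConfig.from_pretrained(model_id)
    print(
        model_id,
        "max_position_embeddings =", cfg.max_position_embeddings,
        "sliding_window =", getattr(cfg, "sliding_window", None),
    )
```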

I would say that Mistral 7B v0.2 is not a pretrained model but an instruction-tuned one, and therefore already carries a bias from its finetuning phase. For complete control over the model's performance, it is best to start from a pretrained base model. That may be why.

nvm
And I think they should definitely go beyond 7B parameters with OpenChat!

https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2 is fine-tuned on the base model mistral-7B-v0.2, which is now officially made available by Mistral AI:

mistral-7B-v0.2

I would love to see an OpenChat fine-tune based on mistral-7B-v0.2 with a 32k context length.

OpenChat team, I Depth Up-Scaled Mistral-7B-v0.2, following UpStage's paper "SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling", if you want to train OpenChat on a slightly bigger model.

Joseph717171/Mistral-10.7B-v0.2

  • 32K Context Window
  • 🚫 Sliding Window Attention
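
For anyone curious what depth up-scaling looks like in practice, here is a rough sketch in plain transformers of the SOLAR-style layer duplication: keep layers 0-23 from one copy of the 32-layer base and layers 8-31 from another, giving a 48-layer, ~10.7B model. The base-model repo ID is an assumption, and this is not necessarily the exact recipe behind Mistral-10.7B-v0.2:

```python
# Rough sketch of SOLAR-style depth up-scaling with transformers.
# Assumptions: the base-model repo ID and the 0-23 / 8-31 layer split from the
# SOLAR paper; the actual Mistral-10.7B-v0.2 recipe may differ (e.g. via mergekit).
import copy

import torch
from transformers import AutoModelForCausalLM

base_id = "mistral-community/Mistral-7B-v0.2"  # illustrative repo ID for the base v0.2 weights

model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
layers = model.model.layers  # 32 decoder layers in Mistral-7B

# Duplicate the stack: first 24 layers plus last 24 layers -> 48 layers (~10.7B params).
model.model.layers = torch.nn.ModuleList(
    [copy.deepcopy(layers[i]) for i in range(0, 24)]
    + [copy.deepcopy(layers[i]) for i in range(8, 32)]
)
model.config.num_hidden_layers = len(model.model.layers)

model.save_pretrained("Mistral-10.7B-v0.2-dus")
```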

Oh, well it was worth a shot.
