adamo1139 committed
Commit cd4124c
1 Parent(s): cc83da6

Update README.md

Files changed (1): README.md +1 -1
README.md CHANGED
@@ -16,7 +16,7 @@ Base model used for fine-tuning was 200k context Yi-34B-Llama model shared by la
 I had to change max_positional_embeddings in config.json and model_max_length to 4096 for training to start, otherwise I was OOMing straight away.
 My first attempt had max_positional_embeddings set to 16384 and model_max_length set to 200000. This allowed fine-tuning to finish, but that model was broken after applying LoRA and merging it. \
 Please use my lesson and be careful when setting positional embeddings for training \
-<b>This model is a third attempt with AEZAKMI v2 dataset and it works perfectly fine.</b>
+<b>This model is my third attempt with AEZAKMI v2 dataset and it works perfectly fine.</b>
 
 ## Prompt Format
 
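For context, the config edit the README text above describes can be sketched as follows. This is an illustrative assumption, not part of the commit: the paths are hypothetical, and standard Llama-style checkpoints spell the key `max_position_embeddings` in config.json, with `model_max_length` typically living in tokenizer_config.json.

```python
import json

# Hypothetical local paths to the model checkout; adjust as needed.
config_path = "Yi-34B-Llama/config.json"
tokenizer_config_path = "Yi-34B-Llama/tokenizer_config.json"

# Cap the positional embedding range used at training time to 4096,
# instead of the 200000 / 16384 values that caused OOM or a broken merge.
with open(config_path) as f:
    config = json.load(f)
config["max_position_embeddings"] = 4096
with open(config_path, "w") as f:
    json.dump(config, f, indent=2)

# Cap the tokenizer's maximum sequence length to match.
with open(tokenizer_config_path) as f:
    tok_config = json.load(f)
tok_config["model_max_length"] = 4096
with open(tokenizer_config_path, "w") as f:
    json.dump(tok_config, f, indent=2)
```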