End of training
Files changed:
- README.md (+5 -5)
- adapter_model.bin (+1 -1)
README.md CHANGED
@@ -4,7 +4,7 @@ library_name: peft
 tags:
 - axolotl
 - generated_from_trainer
-base_model: mistralai/Mixtral-8x7B-v0.1
+base_model: mistralai/Mixtral-8x7B-Instruct-v0.1
 model-index:
 - name: mixtral-fc-op
   results: []
@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 axolotl version: `0.4.0`
 ```yaml
-base_model: mistralai/Mixtral-8x7B-v0.1
+base_model: mistralai/Mixtral-8x7B-Instruct-v0.1
 model_type: AutoModelForCausalLM
 tokenizer_type: LlamaTokenizer
 trust_remote_code: true
@@ -77,7 +77,7 @@ lora_target_modules:
 
 gradient_accumulation_steps: 2
 micro_batch_size: 1
-num_epochs:
+num_epochs: 0.1
 optimizer: paged_adamw_8bit
 lr_scheduler: cosine
 learning_rate: 0.0002
@@ -111,7 +111,7 @@ fsdp_config:
 
 # mixtral-fc-op
 
-This model is a fine-tuned version of [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) on the None dataset.
+This model is a fine-tuned version of [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) on the None dataset.
 
 ## Model description
 
@@ -155,7 +155,7 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 10
-- num_epochs:
+- num_epochs: 0.1
 
 ### Training results
 
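This commit repoints the mixtral-fc-op LoRA adapter at mistralai/Mixtral-8x7B-Instruct-v0.1 and fills in num_epochs: 0.1. As a minimal sketch (not part of the card itself), loading the adapter on top of the updated base model with PEFT could look like the following; the adapter repo id, dtype, and device placement are assumptions, since the diff does not show them.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # base model set by this commit
adapter_id = "your-username/mixtral-fc-op"        # placeholder: actual adapter repo id not shown here

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 to keep the 8x7B weights manageable
    device_map="auto",           # assumption: let accelerate shard across available devices
)
# Attach the trained LoRA adapter weights (adapter_model.bin in this repo) to the base model.
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()
```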
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:35022410b2473b288182ea633c51ffbab225dca82ab0d755107039abc5c6e7a5
 size 27354957
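The adapter_model.bin change only updates the Git LFS pointer (new oid, same 27354957-byte size). A quick sketch, assuming the binary has been downloaded to the working directory, for checking a local copy against the pointer recorded in this commit:

```python
import hashlib
from pathlib import Path

# Values taken from the LFS pointer above.
EXPECTED_OID = "35022410b2473b288182ea633c51ffbab225dca82ab0d755107039abc5c6e7a5"
EXPECTED_SIZE = 27354957  # bytes

path = Path("adapter_model.bin")  # assumption: file already downloaded locally
data = path.read_bytes()
assert len(data) == EXPECTED_SIZE, f"size mismatch: {len(data)} bytes"
assert hashlib.sha256(data).hexdigest() == EXPECTED_OID, "sha256 mismatch"
print("adapter_model.bin matches the LFS pointer")
```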