dyang415 committed
Commit 1970205 · verified · 1 Parent(s): a69b3aa

End of training

Files changed (2):
  1. README.md +5 -5
  2. adapter_model.bin +1 -1
README.md CHANGED
@@ -4,7 +4,7 @@ library_name: peft
  tags:
  - axolotl
  - generated_from_trainer
- base_model: mistralai/Mixtral-8x7B-v0.1
+ base_model: mistralai/Mixtral-8x7B-Instruct-v0.1
  model-index:
  - name: mixtral-fc-op
    results: []
@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  axolotl version: `0.4.0`
  ```yaml
- base_model: mistralai/Mixtral-8x7B-v0.1
+ base_model: mistralai/Mixtral-8x7B-Instruct-v0.1
  model_type: AutoModelForCausalLM
  tokenizer_type: LlamaTokenizer
  trust_remote_code: true
@@ -77,7 +77,7 @@ lora_target_modules:
 
  gradient_accumulation_steps: 2
  micro_batch_size: 1
- num_epochs: 2
+ num_epochs: 0.1
  optimizer: paged_adamw_8bit
  lr_scheduler: cosine
  learning_rate: 0.0002
@@ -111,7 +111,7 @@ fsdp_config:
 
  # mixtral-fc-op
 
- This model is a fine-tuned version of [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) on the None dataset.
+ This model is a fine-tuned version of [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) on the None dataset.
 
  ## Model description
 
@@ -155,7 +155,7 @@ The following hyperparameters were used during training:
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_steps: 10
- - num_epochs: 2
+ - num_epochs: 0.1
 
  ### Training results
 
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:edbb0c5a9a47593a2fa54e50eca0eb9bd36b2e134fa5e023fa03b8f24626cc61
+ oid sha256:35022410b2473b288182ea633c51ffbab225dca82ab0d755107039abc5c6e7a5
  size 27354957
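
In practical terms, this commit points the PEFT adapter at mistralai/Mixtral-8x7B-Instruct-v0.1 instead of the base Mixtral-8x7B-v0.1 and replaces the adapter weights. A minimal loading sketch follows; it assumes the adapter repo id is dyang415/mixtral-fc-op (inferred from the commit author and model name, not stated in this diff) and uses standard transformers/peft calls rather than anything specific to this repo.

```python
# Minimal sketch of loading the updated adapter on the new base model.
# Assumptions (not in the diff): adapter repo id "dyang415/mixtral-fc-op",
# bfloat16 weights, and device_map="auto" placement.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # base_model after this commit
ADAPTER_ID = "dyang415/mixtral-fc-op"             # assumed adapter repo id

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_ID,
    torch_dtype=torch.bfloat16,  # Mixtral-8x7B is large; quantization may be needed in practice
    device_map="auto",
)
# Apply the LoRA weights stored in adapter_model.bin on top of the base model.
model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
model.eval()
```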