BTBurke committed on
Commit
337c57d
1 Parent(s): b4a8d5f

Model save

Files changed (1)
  1. README.md +72 -0
README.md ADDED
@@ -0,0 +1,72 @@
+ ---
+ license: apache-2.0
+ library_name: peft
+ tags:
+ - generated_from_trainer
+ base_model: mistralai/Mixtral-8x7B-v0.1
+ model-index:
+ - name: mixtral-8x7B-2c-v0.1
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
+ # mixtral-8x7B-2c-v0.1
+
+ This model is a fine-tuned version of [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) on an unknown dataset.
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0002
+ - train_batch_size: 1
+ - eval_batch_size: 1
+ - seed: 42
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 2
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_steps: 10
+ - num_epochs: 1
+
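Reviewer note (not part of the commit): the hyperparameters listed above can be sanity-checked with a short sketch. It is an illustration under stated assumptions, not the Trainer's own code. The per-device batch size of 1 times 2 accumulation steps matches the reported `total_train_batch_size` of 2 only if a single device was used, which is an inference rather than something the card states. The schedule below assumes linear warmup followed by a half-cosine decay to zero, a common reading of `lr_scheduler_type: cosine`. `total_steps` is a hypothetical parameter because the dataset size is not given.

```python
import math

# Values copied from the hyperparameter list in the model card.
LEARNING_RATE = 2e-4
WARMUP_STEPS = 10
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 2


def effective_batch_size(num_devices=1):
    """Samples contributing to one optimizer step.

    num_devices=1 is an assumption inferred from total_train_batch_size: 2.
    """
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * num_devices


def lr_at(step, total_steps):
    """Learning rate at an optimizer step, assuming linear warmup and then
    a half-cosine decay to zero. total_steps is hypothetical here."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    return LEARNING_RATE * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))
```

With only 10 warmup steps, the peak rate of 0.0002 is reached almost immediately, so nearly the whole single epoch runs on the decaying part of the curve.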
+ ### Framework versions
+
+ - Transformers 4.37.0.dev0
+ - Pytorch 2.0.1+cu118
+ - Datasets 2.15.0
+ - Tokenizers 0.15.0
+ ## Training procedure
+
+
+ The following `bitsandbytes` quantization config was used during training:
+ - quant_method: bitsandbytes
+ - load_in_8bit: False
+ - load_in_4bit: True
+ - llm_int8_threshold: 6.0
+ - llm_int8_skip_modules: None
+ - llm_int8_enable_fp32_cpu_offload: False
+ - llm_int8_has_fp16_weight: False
+ - bnb_4bit_quant_type: nf4
+ - bnb_4bit_use_double_quant: True
+ - bnb_4bit_compute_dtype: bfloat16
+
+ ### Framework versions
+
+
+ - PEFT 0.6.0
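Reviewer note (not part of the commit): the README does not show how to load the adapter. The sketch below is one plausible way to do so, assuming the `transformers`, `peft`, `bitsandbytes`, `accelerate`, and `torch` packages are installed. The keyword values are copied from the `bitsandbytes` list in the card. `adapter_id` is a hypothetical placeholder, since the diff does not name the repository holding the adapter weights, and `device_map="auto"` is an assumption about the target hardware.

```python
# Values copied from the bitsandbytes quantization list in the model card.
BNB_KWARGS = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
    "bnb_4bit_compute_dtype": "bfloat16",
}


def build_quantization_config():
    """Turn the listed values into a transformers BitsAndBytesConfig."""
    # Heavy imports are deferred so the dict above can be inspected without them.
    import torch
    from transformers import BitsAndBytesConfig

    kwargs = dict(BNB_KWARGS)
    # Pass the dtype object rather than its string name.
    kwargs["bnb_4bit_compute_dtype"] = torch.bfloat16
    return BitsAndBytesConfig(**kwargs)


def load_adapter(adapter_id, base_id="mistralai/Mixtral-8x7B-v0.1"):
    """Load the 4-bit base model and attach the PEFT adapter.

    adapter_id is a placeholder for the adapter's repo id or local path;
    the model card does not state it.
    """
    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    base = AutoModelForCausalLM.from_pretrained(
        base_id,
        quantization_config=build_quantization_config(),
        device_map="auto",  # assumption: let accelerate place the layers
    )
    return PeftModel.from_pretrained(base, adapter_id)
```

Using the same NF4, double-quantization, and bfloat16 compute settings at load time keeps inference numerics close to those used during training, although the card reports no evaluation results to confirm this.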