Liveme
/

T3Q-Mistral-Orca-Math-DPO_qlora_20240516-2018_4bit_merge


quan26 committed on Jun 3

Commit 0f5fdba (1 parent: 96c6f2b)

Create README.md

Files changed (1)

README.md +50 -0

README.md ADDED

	@@ -0,0 +1,50 @@

+---
+library_name: peft
+base_model: chihoonlee10/T3Q-Mistral-Orca-Math-DPO
+---
+# Model Card for Model ID
+Inference parameters worth noting:
+```
+params = {
+  'temperature': 0.85,
+  'top_p': 0.95,
+  'top_k': 20,
+  'repetition_penalty': 1.18,
+  'max_tokens': 500,
+  'stop': [],
+  'typical_p': 0.95,
+  'n': 1,
+}
+```
+Prompt template format:
+```
+### Instruction:
+<prompt> (without the <>)
+### Response:
+```
+Training parameters (trained with Llama-Factory):
+```
+- learning_rate: 5e-05
+- lr_scheduler_type: cosine
+- per_device_train_batch_size: 1
+- per_device_eval_batch_size: 1
+- gradient_accumulation_steps: 2
+- warmup_steps: 24
+- num_train_epochs: 2
+- template: alpaca
+- cutoff_len: 4096
+- finetuning_type: lora
+- lora_target: q_proj,v_proj,o_proj,k_proj
+- quantization_bit: 4
+- lora_rank: 64
+- lora_alpha: 16
+- bf16: True
+- logging_steps: 20
+- val_size: 4
+- save_steps: 200
+```
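
The prompt template and sampling parameters in the README above can be combined into a single request payload. The following is a minimal sketch, not part of the commit: `build_prompt` and `build_request` are hypothetical helper names, the payload shape is an assumption rather than any specific server's API, and the trailing newline after `### Response:` is also an assumption, since the README does not say how the template should end.

```python
# Sampling settings copied verbatim from the README's params block.
PARAMS = {
    "temperature": 0.85,
    "top_p": 0.95,
    "top_k": 20,
    "repetition_penalty": 1.18,
    "max_tokens": 500,
    "stop": [],
    "typical_p": 0.95,
    "n": 1,
}


def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the '### Instruction / ### Response' template.

    Hypothetical helper: the trailing newline after '### Response:' is an
    assumption, not something the README specifies.
    """
    return f"### Instruction:\n{instruction.strip()}\n### Response:\n"


def build_request(instruction: str) -> dict:
    """Merge the formatted prompt with the sampling parameters.

    Hypothetical payload shape; adapt the keys to whatever inference
    backend is actually in use.
    """
    request = dict(PARAMS)          # shallow copy so PARAMS is not mutated
    request["stop"] = list(PARAMS["stop"])  # give each request its own list
    request["prompt"] = build_prompt(instruction)
    return request


if __name__ == "__main__":
    print(build_request("What is 12 * 7?")["prompt"])
```

Keeping the parameters in one dictionary makes it easy to check that the values sent at inference time match the ones recommended in the README.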