Triangle104/Tulu-3.69-DPO-8B-Q4_K_M-GGUF

Triangle104 committed 20 days ago

Commit e8037ce • 1 Parent(s): da6e2b6

Update README.md

Files changed (1)

README.md +132 -0

@@ -12,6 +12,138 @@ tags:
 This model was converted to GGUF format from [`FourOhFour/Tulu-3.69-DPO-8B`](https://huggingface.co/FourOhFour/Tulu-3.69-DPO-8B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
 Refer to the [original model card](https://huggingface.co/FourOhFour/Tulu-3.69-DPO-8B) for more details on the model.
+---
+Model details:
+-
+This is a DPO (Direct Preference Optimization) fine-tune applied over Tulu-3.69-8B. It is designed to
+roleplay and converse like a human chat partner. It follows
+instructions well and excels at playing characters in a realistic and
+entertaining manner.
+For ease of use, try the Llama 3 instruct format. You may need to set a custom stop string for `<|end_of_text|>`.
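If your frontend has no stop-string setting, the same effect can be approximated by trimming the raw completion yourself. The following is a minimal sketch, assuming you have the generated text as a plain Python string; `cut_at_stop` is a hypothetical helper name, not part of llama.cpp or any frontend.

```python
def cut_at_stop(text, stop="<|end_of_text|>"):
    """Return the text before the first stop string, or all of it if the stop is absent."""
    head, _, _ = text.partition(stop)
    return head

# Hypothetical raw completion that ran past the end-of-text marker.
raw = "Sure, here you go.<|end_of_text|><|user|>leftover"
reply = cut_at_stop(raw)
print(reply)
```

Trimming after the fact still spends tokens on the discarded tail, so a real stop string in the frontend is preferable where available.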
+For optimal performance I have found that a modified Tulu 3 instruct format is quite effective:
+<|system|>
+This is an instruction.
+<|end_of_text|>
+<|user|>
+This is the user input.
+<|assistant|>
+This is model output.
+<|end_of_text|>
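The template above can be assembled programmatically. This is a sketch under stated assumptions: `build_prompt` and its parameters are hypothetical names, and the exact newline placement is inferred from the layout shown in the card rather than confirmed by the model author.

```python
# Sketch only: names here are illustrative, not taken from the model card.
END = "<|end_of_text|>"

def build_prompt(system, turns, user_input):
    """Render a chat in the modified Tulu 3 layout.

    turns is a list of (user_text, assistant_text) pairs from earlier in the chat.
    The system block and each finished assistant reply are closed with END;
    user blocks are not, matching the template in the card.
    """
    parts = [f"<|system|>\n{system}\n{END}\n"]
    for user_text, assistant_text in turns:
        parts.append(f"<|user|>\n{user_text}\n")
        parts.append(f"<|assistant|>\n{assistant_text}\n{END}\n")
    parts.append(f"<|user|>\n{user_input}\n")
    parts.append("<|assistant|>\n")  # left open for the model to complete
    return "".join(parts)

prompt = build_prompt("This is an instruction.", [], "This is the user input.")
print(prompt)
```

The prompt ends with an open `<|assistant|>` block so that generation continues as the model's reply.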
+Further, if you want your bot to have a sense of time, you can set the last output prefix as follows:
+<|system|>
+{{time}} {{weekday}} {{date}}
+<|end_of_text|>
+<|assistant|>
+Note: these macros may differ in your chosen inference frontend. Adjust them accordingly.
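If your frontend does not expand such macros, the prefix can be filled in by hand. A minimal sketch, assuming a 24-hour clock and an ISO date; the card does not specify the formats, and `time_prefix` is a hypothetical helper.

```python
from datetime import datetime

def time_prefix(now):
    """Fill the time, weekday and date slots, then open the assistant turn."""
    # Format choices are assumptions; pick whatever your frontend's macros would emit.
    stamp = f"{now:%H:%M} {now:%A} {now:%Y-%m-%d}"
    return f"<|system|>\n{stamp}\n<|end_of_text|>\n<|assistant|>\n"

# A fixed timestamp keeps the example reproducible; use datetime.now() in practice.
prefix = time_prefix(datetime(2024, 12, 1, 14, 30))
print(prefix)
```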
+Training configuration (Axolotl-style YAML):
+base_model: jeiku/Tulu-3.69-8B
+model_type: AutoModelForCausalLM
+tokenizer_type: AutoTokenizer
+load_in_8bit: false
+load_in_4bit: false
+strict: false
+hub_model_id: jeiku/tuludpo
+hub_strategy: "all_checkpoints"
+push_dataset_to_hub:
+hf_use_auth_token: true
+chat_template: llama3
+rl: dpo
+datasets:
+  - path: antiven0m/physical-reasoning-dpo
+    type: llama3.prompt_pairs
+  - path: nbeerbower/Purpura-DPO
+    type: llama3.prompt_pairs
+  - path: FourOhFour/Human_DPO_Emojis_Removed
+    type: llama3.prompt_pairs
+shuffle_merged_datasets: true
+val_set_size: 0.005
+output_dir: ./outputs/out
+sequence_len: 8192
+sample_packing: false
+eval_sample_packing: false
+pad_to_sequence_len: false
+wandb_project: evil
+wandb_entity:
+wandb_watch:
+wandb_name: evil
+wandb_log_model:
+gradient_accumulation_steps: 16
+micro_batch_size: 2
+num_epochs: 2
+optimizer: adamw_bnb_8bit
+lr_scheduler: cosine
+learning_rate: 0.000005
+weight_decay: 0.05
+train_on_inputs: false
+group_by_length: false
+bf16: auto
+fp16:
+tf32: true
+gradient_checkpointing: true
+early_stopping_patience:
+resume_from_checkpoint:
+local_rank:
+logging_steps: 1
+xformers_attention:
+flash_attention: true
+warmup_steps: 10
+evals_per_epoch: 2
+eval_table_size:
+eval_max_new_tokens:
+saves_per_epoch: 1
+debug:
+deepspeed:
+fsdp:
+fsdp_config:
+special_tokens:
+  pad_token: <|finetune_right_pad_id|>
+---
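Two numbers in the config above are easier to read once worked out: the effective batch size and the learning-rate curve. The sketch below assumes a single device and uses the common linear-warmup-then-cosine shape; the device count and `total_steps` are not stated in the card and are placeholders, and the trainer's actual scheduler may differ in detail.

```python
import math

# Values copied from the config above.
micro_batch_size = 2
gradient_accumulation_steps = 16
num_devices = 1  # assumption: the card does not state the device count

# Samples contributing to each optimizer step.
effective_batch = micro_batch_size * gradient_accumulation_steps * num_devices

def lr_at(step, total_steps, peak_lr=5e-6, warmup_steps=10):
    """Linear warmup to peak_lr, then cosine decay to zero at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total_steps = 200  # hypothetical; depends on dataset size and num_epochs
for step in (0, 5, 10, 105, 200):
    print(step, lr_at(step, total_steps))
```

With more devices the effective batch scales linearly, so the same config on several GPUs takes correspondingly fewer optimizer steps per epoch.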
 ## Use with llama.cpp
 Install llama.cpp through brew (works on Mac and Linux)