wzhouad/zephyr-7b-dpo-full

wzhouad committed on May 7

Commit 30bf93e • 1 Parent(s): be0c02c

Training in progress, step 1200

Files changed (3)

README.md +6 -5
config.json +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -2,20 +2,21 @@
 license: mit
 base_model: HuggingFaceH4/mistral-7b-sft-beta
 tags:
-- trl
-- dpo
+- alignment-handbook
 - generated_from_trainer
+datasets:
+- HuggingFaceH4/hh-rlhf-h4
 model-index:
-- name: zephyr-7b-dpo-full
+- name: baseline2
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# zephyr-7b-dpo-full
-This model is a fine-tuned version of [HuggingFaceH4/mistral-7b-sft-beta](https://huggingface.co/HuggingFaceH4/mistral-7b-sft-beta) on the None dataset.
+# baseline2
+This model is a fine-tuned version of [HuggingFaceH4/mistral-7b-sft-beta](https://huggingface.co/HuggingFaceH4/mistral-7b-sft-beta) on the HuggingFaceH4/hh-rlhf-h4 dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.0260
 - Rewards/chosen: -2.1934

config.json CHANGED

@@ -20,6 +20,6 @@
   "tie_word_embeddings": false,
   "torch_dtype": "bfloat16",
   "transformers_version": "4.35.2",
-  "use_cache": false,
+  "use_cache": true,
   "vocab_size": 32000
 }
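
The only change in this hunk is the `use_cache` flag flipping from `false` to `true`. In `transformers`, this flag controls whether key/value states are cached during generation; it is commonly turned off while training and turned back on for inference. The sketch below uses only the standard library to confirm that a single key differs between the two versions. It assumes a fragment containing just the keys visible in the hunk (the full `config.json` has more), and the names `old_cfg`, `new_cfg`, and `changed` are illustrative.

```python
import json

# Only the keys visible in the hunk above; the real config.json contains more.
old_cfg = json.loads("""
{
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.35.2",
  "use_cache": false,
  "vocab_size": 32000
}
""")

# The commit changes a single value.
new_cfg = dict(old_cfg, use_cache=True)

# Collect every key whose value differs, as (old, new) pairs.
changed = {
    key: (old_cfg[key], new_cfg[key])
    for key in old_cfg
    if old_cfg[key] != new_cfg[key]
}
print(changed)
```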

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0bb88c129b5447ee34b7f3cdb54f7532418c2781e6bdc17cbf7942a9a1e218bd
+oid sha256:ef3f3bcb1d637ffd73632ad00af47d3006ac1e6c1f0c109c90bd802bdaba6dcd
 size 5944
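
`training_args.bin` is stored through Git LFS, so the diff shows only the pointer file: the binary's content hash changed while its size stayed at 5944 bytes. As a minimal sketch of reading such pointers, the helper below (`parse_lfs_pointer` is a hypothetical name, not part of any Git LFS tooling) splits each `key value` line and compares the two versions from this commit.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


old = parse_lfs_pointer("""version https://git-lfs.github.com/spec/v1
oid sha256:0bb88c129b5447ee34b7f3cdb54f7532418c2781e6bdc17cbf7942a9a1e218bd
size 5944
""")

new = parse_lfs_pointer("""version https://git-lfs.github.com/spec/v1
oid sha256:ef3f3bcb1d637ffd73632ad00af47d3006ac1e6c1f0c109c90bd802bdaba6dcd
size 5944
""")

content_changed = old["digest"] != new["digest"]
size_unchanged = old["size"] == new["size"]
print(content_changed, size_unchanged)
```

An unchanged size with a new digest is consistent with the same set of training arguments being re-serialized with different values, though the pointer alone cannot show which values differ.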