ydeng9 committed on
Commit 88664b0
1 Parent(s): 9257b64

Update README.md

Files changed (1):
  1. README.md +12 -10
README.md CHANGED
@@ -7,6 +7,8 @@ language:
 base_model: mistralai/Mistral-7B-v0.1
 pipeline_tag: text-generation
 ---
+see our paper in https://arxiv.org/abs/2401.01335
+
 # zephyr-7b-sft-full-spin-iter1
 
 This model is a self-play fine-tuned model at iteration 1 from [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) using synthetic data based on the [HuggingFaceH4/ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) dataset.
@@ -23,16 +25,16 @@ This model is a self-play fine-tuned model at iteration 1 from [alignment-handbo
 ### Training hyperparameters
 The following hyperparameters were used during training:
 
-learning_rate: 5e-07
-train_batch_size: 8
-seed: 42
-distributed_type: multi-GPU
-num_devices: 8
-total_train_batch_size: 64
-optimizer: RMSProp
-lr_scheduler_type: linear
-lr_scheduler_warmup_ratio: 0.1
-num_epochs: 2.0
+- learning_rate: 5e-07
+- train_batch_size: 8
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 8
+- total_train_batch_size: 64
+- optimizer: RMSProp
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 2.0
 
 ## Citation
 ```
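The hyperparameters in the hunk above are internally consistent: the total train batch size is the per-device batch size times the number of devices. A minimal sketch of that arithmetic, assuming a gradient accumulation factor of 1 (it is not listed in this hunk):

```python
# Sanity check of the batch-size arithmetic from the training hyperparameters.
train_batch_size = 8             # per-device batch size, from the diff
num_devices = 8                  # multi-GPU run, from the diff
gradient_accumulation_steps = 1  # assumption: not shown in the diff

total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
print(total_train_batch_size)  # 64, matching the listed total_train_batch_size
```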