HumanF-MarkrAI/COKAL-DPO-13b-v2

DopeorNope committed on Nov 11, 2023

Commit ba3a216 (1 parent: af0633b)

Update README.md

Files changed (1)

README.md +3 -3

@@ -24,9 +24,9 @@ license: cc-by-nc-sa-4.0
 **Output** Models generate text only.
 **Model Architecture**
-COKAL-DPO_test-v2 is an auto-regressive 13B language model based on the LLaMA2 transformer architecture.
+COKAL-DPO_13b-v2 is an auto-regressive 13B language model based on the LLaMA2 transformer architecture.
-**Base Model**  [DopeorNope/COKAL_pre_DPO_Test_v1-13b](https://huggingface.co/DopeorNope/COKAL_pre_DPO_Test_v1-13b)
+**Base Model**  [DopeorNope/COKAL_pre_DPO_Test_v2-13b](https://huggingface.co/DopeorNope/COKAL_pre_DPO_Test_v2-13b)
 DopeorNope/COKAL_pre_DPO_Test_v2-13b is the SFT model to train with DPO methodology.
@@ -40,7 +40,7 @@ This dataset was constructed by directly collecting and reorganizing data by Dop
 This dataset is based on ["kyujinpy/OpenOrca-KO"](https://huggingface.co/datasets/kyujinpy/OpenOrca-KO) and has been processed using the Near Dedup algorithm to remove items with a Jaccard Similarity threshold of 0.8 or higher. In addition, inconsistent inputs have been cleaned and modified.
 **Training**
-The difference between "DopeorNope/COKAL-DPO_test-v2" and this model is that this model has different hyperparameters from the one in that setting when it comes to the final version.
+The difference between "DopeorNope/COKAL-DPO_test-v2" and this model is that this model has different hyper-parameters from the one in that setting regarding the final version.
 I developed the model in an environment with four RTX 3090 GPUs running Ubuntu 18.04.
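
The README context in this diff mentions near-deduplication that removes items at a Jaccard similarity of 0.8 or higher. The sketch below illustrates that idea only; it is not the author's pipeline. It assumes word-level 3-gram shingles and an exact pairwise comparison, whereas near-dedup tooling commonly approximates this with MinHash and LSH for scale. The function names (`shingles`, `jaccard`, `near_dedup`) and the shingle size are hypothetical choices made for illustration.

```python
def shingles(text, n=3):
    """Return the set of word-level n-grams for a text (assumed tokenization: whitespace)."""
    tokens = text.lower().split()
    if len(tokens) < n:
        # Short texts collapse to a single shingle made of all their tokens.
        return {tuple(tokens)}
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_dedup(texts, threshold=0.8, n=3):
    """Greedy near-dedup: keep a text only if its similarity to every
    previously kept text is below the threshold (O(N^2) comparisons)."""
    kept = []
    kept_shingles = []
    for text in texts:
        s = shingles(text, n)
        if all(jaccard(s, k) < threshold for k in kept_shingles):
            kept.append(text)
            kept_shingles.append(s)
    return kept


if __name__ == "__main__":
    samples = [
        "the cat sat on the mat today",
        "the cat sat on the mat today",
        "a completely different sentence about dogs",
    ]
    print(near_dedup(samples))
```

The greedy pass keeps the first occurrence of each near-duplicate cluster, so the result depends on input order. The pairwise loop is only practical for small datasets.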