chargoddard/servile-harpsichord-cdpo

chargoddard committed on Dec 10, 2023

Commit bea3712 · 1 parent: 4415449

Create README.md

Files changed (1)

README.md +16 -0

	@@ -0,0 +1,16 @@

+---
+license: cc-by-nc-4.0
+datasets:
+- pankajmathur/orca_mini_v1_dataset
+- openai/summarize_from_feedback
+- PygmalionAI/PIPPA
+- chargoddard/rpguild
+- lemonilia/LimaRP
+- PKU-Alignment/PKU-SafeRLHF
+- Intel/orca_dpo_pairs
+- argilla/ultrafeedback-binarized-preferences
+---
+Trained on a different random sample of the same datasets used by [loyal-piano-m7](https://huggingface.co/chargoddard/loyal-piano-m7), then further trained with cDPO (conservative DPO) on a blend of RLHF datasets.
+Uses the Alpaca prompt format.
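
The README only says that the model uses the Alpaca prompt format and does not spell out a template. As a minimal sketch, assuming the model follows the standard Stanford Alpaca templates (the exact wording used in training may differ), a prompt could be assembled as below. The helper name `build_alpaca_prompt` and its parameters are hypothetical and are not part of this repository.

```python
# Standard Stanford Alpaca templates (assumed; the README does not
# state which variant the model was trained with).
ALPACA_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{context}\n\n"
    "### Response:\n"
)

ALPACA_NO_INPUT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str, context: str = "") -> str:
    """Return an Alpaca-style prompt (hypothetical helper).

    Uses the with-input template only when extra context is supplied.
    """
    if context:
        return ALPACA_WITH_INPUT.format(instruction=instruction, context=context)
    return ALPACA_NO_INPUT.format(instruction=instruction)


if __name__ == "__main__":
    print(build_alpaca_prompt("Summarize the following text.", "Some text to summarize."))
```

The resulting string ends with `### Response:\n`, so the model's generated text is read as the response section.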