Running on Zero 10 🦾💪🏽 Human Feedback Collector | Meta-Llama-3.1-8B-Instruct | (DPO) LLM, chatbot, human-feedback
argilla/ultrafeedback-binarized-preferences-cleaned Viewer • Updated Dec 11, 2023 • 60.9k • 4.37k • 109
Ray2333/reward-model-Mistral-7B-instruct-Unified-Feedback Text Classification • Updated 27 days ago • 1.13k • 11