Finetune on "uncensored" dataset?

#32
by sivarajan - opened

The datasets used for fine-tuning the model introduce significant bias in responses and a marked reduction in capability, famously with the verbal tic "I'm sorry, but as a large language model … ". Have you considered fine-tuning Falcon on datasets with such responses removed?
See evol_instruct_unfiltered and ShareGPT_unfiltered.
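The "unfiltered" dataset variants above are built by dropping examples whose responses contain canned refusal phrases. A minimal sketch of that idea (the data shape and marker list here are assumptions, not the actual filtering scripts used for those datasets):

```python
# Hypothetical refusal markers; real unfiltered datasets use longer lists.
REFUSAL_MARKERS = [
    "as a large language model",
    "as an ai language model",
    "i'm sorry, but",
]

def is_refusal(response: str) -> bool:
    """True if the response contains a known refusal/moralizing marker."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

def filter_dataset(examples: list[dict]) -> list[dict]:
    """Keep only instruction/response pairs without refusal boilerplate.

    Assumes each example is a dict with a "response" key.
    """
    return [ex for ex in examples if not is_refusal(ex["response"])]

data = [
    {"instruction": "Explain quicksort.",
     "response": "Quicksort partitions the array around a pivot ..."},
    {"instruction": "Do X.",
     "response": "I'm sorry, but as a large language model I cannot do that."},
]
print(len(filter_dataset(data)))  # prints 1
```

Simple substring matching like this also catches legitimate apologies, so the real cleaning pipelines tend to be more careful, but it captures the gist.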

That would be amazing!

The censored models are not only biased but, as a result, less useful.
