What is the split used to report results on the OpenAI Moderation Dataset?

#4
by avanigupta - opened

It seems the model was trained/fine-tuned on the OpenAI Moderation evaluation dataset: https://huggingface.co/datasets/mmathys/openai-moderation-api-evaluation

There are some validation metric scores mentioned here: https://huggingface.co/KoalaAI/Text-Moderation#validation-metrics

Could you share the split you used for them? I am not able to replicate your scores on the entire https://huggingface.co/datasets/mmathys/openai-moderation-api-evaluation dataset.

@KoalaAI could you please comment on it?

Koala AI org

Hi! Sorry for the delayed response. I don't get notifications from this org.

This dataset was used as a base and was modified to fit within the requirements of AutoTrain, which has since been axed, so I'm not sure I still have that variant of the training data split.
I still have the script used to modify the training data, but the split was made randomly by AutoTrain during training.
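
Since the validation split was drawn at random, scores computed on the full dataset may not match the reported validation metrics. For anyone building their own comparison, here is a minimal sketch of a reproducible random split. The function name `random_split`, the 20% validation fraction, and the seed are assumptions for illustration only; the ratio and seed AutoTrain actually used are not known from this thread, so this will not recover the original split.

```python
import random


def random_split(items, valid_fraction=0.2, seed=42):
    """Shuffle a copy of `items` with a fixed seed and return (train, valid).

    `valid_fraction` and `seed` are illustrative assumptions, not the values
    AutoTrain used.
    """
    if not 0.0 < valid_fraction < 1.0:
        raise ValueError("valid_fraction must be strictly between 0 and 1")
    shuffled = list(items)
    # A dedicated Random instance keeps the split independent of global state.
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]


# Stand-in for dataset rows; in practice these would be the dataset's examples.
rows = list(range(10))
train, valid = random_split(rows)
```

Fixing the seed means the same rows land in the validation set on every run, so metrics from different runs can be compared against each other.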
