On-demand answers available? [guardrailing]
#156, opened by Quetzalcoatl-homotopy
Context: the last sentence of the Mistral documentation on guardrailing reads:
"The answers of Mistral 7B-Instruct without prompt and with Mistral prompts are available on demand as they contain examples of text that may be considered unsafe, offensive, or upsetting."
General question: what does "on demand" mean here?
Question 1: How can we get a sample of the prompt-answer pairs for Mistral 7B-Instruct that "may be considered unsafe, offensive, or upsetting"?
Question 2: Can we get a sample of the dataset alluded to under content moderation: "We evaluated self-reflection on our manually curated and balanced dataset of adversarial and standard prompts and got a precision of 99.4% for a recall of 95.6% (considering acceptable prompts as positives)."?
Quetzalcoatl-homotopy changed discussion title from "On-demand answers available? [gaurdailing]" to "On-demand answers available? [guardrailing]"