Output Guardrail Annotated Datasets Collection
Collection of annotated output guardrail datasets (mainly benchmarking datasets) • 11 items • Updated about 14 hours ago
GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering Paper • 2409.06595 • Published Sep 10, 2024 • 38
PrimeGuard: Safe and Helpful LLMs through Tuning-Free Routing Paper • 2407.16318 • Published Jul 23, 2024 • 7
Does fine-tuning GPT-3 with the OpenAI API leak personally-identifiable information? Paper • 2307.16382 • Published Jul 31, 2023
CONSCENDI: A Contrastive and Scenario-Guided Distillation Approach to Guardrail Models for Virtual Assistants Paper • 2304.14364 • Published Apr 27, 2023