How talkative is your chatbot about your internal data? 😬
As more chatbots get deployed in production, with access to internal databases, we need to make sure they don't leak private information to anyone interacting with them.
The Lighthouz AI team therefore introduced the Chatbot Guardrails Arena to stress test models and see how well guarded your private information is.
Anyone can try to make models reveal information they should not share 😈
(which is quite fun to do for the strongest models)!
The votes will then be gathered to create an Elo ranking of the safest models with respect to PII.
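For readers unfamiliar with Elo, here is a minimal sketch of how pairwise votes can be turned into a ranking. It uses the textbook Elo update and is not the arena's actual implementation: the function names, the starting rating of 1000, the K-factor of 32, and the vote format (model A, model B, winner) are all assumptions made for illustration.

```python
# Illustrative Elo ranking from pairwise votes.
# Assumptions (not from the arena): initial rating 1000, K-factor 32,
# and votes given as (model_a, model_b, winner) tuples.

def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return updated (rating_a, rating_b); score_a is 1 (A wins), 0 (B wins) or 0.5 (tie)."""
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


def rank_models(votes, initial=1000.0, k=32.0):
    """Apply votes in order and return (model, rating) pairs, highest rating first."""
    ratings = {}
    for model_a, model_b, winner in votes:
        ra = ratings.setdefault(model_a, initial)
        rb = ratings.setdefault(model_b, initial)
        if winner == model_a:
            score_a = 1.0
        elif winner == model_b:
            score_a = 0.0
        else:
            score_a = 0.5  # anything else is treated as a tie
        ratings[model_a], ratings[model_b] = update_elo(ra, rb, score_a, k)
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical votes: the "winner" is the model judged safer in that round.
    votes = [
        ("model-a", "model-b", "model-a"),
        ("model-b", "model-c", "model-c"),
        ("model-a", "model-c", "tie"),
    ]
    for name, rating in rank_models(votes):
        print(f"{name}: {rating:.1f}")
```

Each vote moves rating points from the loser to the winner, and beating a higher-rated model moves more points than beating a lower-rated one.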
In the future, with the support of the community, this arena could inform the safety choices companies make when selecting models and guardrails based on their resistance to adversarial attacks.
It's also a good way to easily demonstrate the limitations of current systems!
Check out the arena: lighthouzai/guardrails-arena
Learn more in the blog: https://huggingface.co/blog/arena-lighthouz