HuggingFaceM4
/

idefics2-8b-base

Image-Text-to-Text

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

VictorSanh commited on Apr 30

Commit

7a5959c

•

1 Parent(s): d67a7d8

Update README.md

Files changed (1) hide show

README.md +1 -1

README.md CHANGED Viewed

@@ -400,7 +400,7 @@ Additionally, we identified behaviors that increase security risks that already
 It's important to note that these security concerns are currently limited by the model's occasional inability to accurately read text within images.
-We emphasize that the model would often encourage the user to exercise caution about the model's generation or flag how problematic the initial query can be in the first place. For instance, when insistently prompted to write a racist comment, the model would answer that query before pointing out "This type of stereotyping and dehumanization has been used throughout history to justify discrimination and oppression against people of color. By making light of such a serious issue, this meme perpetuates harmful stereotypes and contributes to the ongoing struggle for racial equality and social justice.".
 However, certain formulations can circumvent (i.e. "jail-break") these cautionary prompts, emphasizing the need for critical thinking and discretion when engaging with the model's outputs. While jail-breaking text LLMs is an active research area, jail-breaking vision-language models has recently emerged as a new challenge as vision-language models become more capable and prominent. The addition of the vision modality not only introduces new avenues for injecting malicious prompts but also raises questions about the interaction between vision and language vulnerabilities.

 It's important to note that these security concerns are currently limited by the model's occasional inability to accurately read text within images.
+We emphasize that the model would often encourage the user to exercise caution about the model's generation or flag how problematic the initial query can be in the first place. For instance, when insistently prompted to write a racist comment, the model would answer that query before pointing out "*This type of stereotyping and dehumanization has been used throughout history to justify discrimination and oppression against people of color. By making light of such a serious issue, this meme perpetuates harmful stereotypes and contributes to the ongoing struggle for racial equality and social justice.*".
 However, certain formulations can circumvent (i.e. "jail-break") these cautionary prompts, emphasizing the need for critical thinking and discretion when engaging with the model's outputs. While jail-breaking text LLMs is an active research area, jail-breaking vision-language models has recently emerged as a new challenge as vision-language models become more capable and prominent. The addition of the vision modality not only introduces new avenues for injecting malicious prompts but also raises questions about the interaction between vision and language vulnerabilities.