inkpad committed
Commit f6f2e31 · unverified · 1 parent: f257352

updated README: x2 to x1

Files changed (1): README.md (+1 -2)
README.md CHANGED
@@ -190,7 +190,6 @@ Is the user message harmful based on the risk definition? Your answer must be ei
 
 - Granite Guardian models must <ins>only</ins> be used strictly for the prescribed scoring mode, which generates yes/no outputs based on the specified template. Any deviation from this intended use may lead to unexpected, potentially unsafe, or harmful outputs. The model may also be prone to such behaviour via adversarial attacks.
 - The model is targeted for risk definitions of general harm, social bias, profanity, violence, sexual content, unethical behavior, harm engagement, evasiveness, jailbreaking, groundedness/relevance for retrieval-augmented generation, and function calling hallucinations for agentic workflows. It is also applicable for use with custom risk definitions, but these require testing.
-It is also applicable for use with custom risk definitions, but these require testing.
 - The model is only trained and tested on English data.
 - Given their parameter size, the main Granite Guardian models are intended for use cases that require moderate cost, latency, and throughput such as model risk assessment, model observability and monitoring, and spot-checking inputs and outputs.
 Smaller models, like the [Granite-Guardian-HAP-38M](https://huggingface.co/ibm-granite/granite-guardian-hap-38m) for recognizing hate, abuse and profanity can be used for guardrailing with stricter cost, latency, or throughput requirements.
@@ -252,4 +251,4 @@ The model performance is evaluated on sample conversations taken from the [DICES
   primaryClass={cs.CL},
   url={https://arxiv.org/abs/2412.07724},
 }
-```
+```
 
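For context on the scoring mode these guidelines pin down, here is a minimal sketch of how the yes/no template is typically driven through `transformers`. The checkpoint name, the `guardian_config` keys, and the example prompt are assumptions drawn from the ibm-granite model cards, not from this commit; check the card for your exact checkpoint before relying on them.

```python
# Minimal sketch of the prescribed yes/no scoring mode (assumptions noted above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "ibm-granite/granite-guardian-3.1-2b"  # assumed checkpoint name
model = AutoModelForCausalLM.from_pretrained(
    model_path, device_map="auto", torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# "harm" is one of the risk definitions listed in the diff above; the
# guardian_config key is an assumption taken from the model cards.
guardian_config = {"risk_name": "harm"}
messages = [{"role": "user", "content": "How can I figure out the pin code to a phone?"}]

# The chat template renders the scoring prompt; the model is meant to answer
# with a bare "Yes" (risk present) or "No" (risk absent).
input_ids = tokenizer.apply_chat_template(
    messages,
    guardian_config=guardian_config,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    output = model.generate(input_ids, do_sample=False, max_new_tokens=20)

label = tokenizer.decode(
    output[0, input_ids.shape[-1]:], skip_special_tokens=True
).strip()
print(label)  # expected: "Yes" or "No"
```

Parsing anything beyond that single Yes/No answer is exactly the kind of deviation from the prescribed scoring mode that the first bullet warns against.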