inkpad committed
Commit f6f2e31 · unverified · 1 parent: f257352

updated README: x2 to x1

Files changed (1): README.md (+1 -2)
README.md CHANGED
@@ -190,7 +190,6 @@ Is the user message harmful based on the risk definition? Your answer must be ei
 
 - Granite Guardian models must <ins>only</ins> be used strictly for the prescribed scoring mode, which generates yes/no outputs based on the specified template. Any deviation from this intended use may lead to unexpected, potentially unsafe, or harmful outputs. The model may also be prone to such behaviour via adversarial attacks.
 - The model is targeted for risk definitions of general harm, social bias, profanity, violence, sexual content, unethical behavior, harm engagement, evasiveness, jailbreaking, groundedness/relevance for retrieval-augmented generation, and function calling hallucinations for agentic workflows. It is also applicable for use with custom risk definitions, but these require testing.
-It is also applicable for use with custom risk definitions, but these require testing.
 - The model is only trained and tested on English data.
 - Given their parameter size, the main Granite Guardian models are intended for use cases that require moderate cost, latency, and throughput such as model risk assessment, model observability and monitoring, and spot-checking inputs and outputs.
 Smaller models, like the [Granite-Guardian-HAP-38M](https://huggingface.co/ibm-granite/granite-guardian-hap-38m) for recognizing hate, abuse and profanity can be used for guardrailing with stricter cost, latency, or throughput requirements.
@@ -252,4 +251,4 @@ The model performance is evaluated on sample conversations taken from the [DICES
   primaryClass={cs.CL},
   url={https://arxiv.org/abs/2412.07724},
 }
-```
+```
 
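For context on the scoring mode these guidelines pin down, here is a minimal sketch of how the yes/no template is typically driven through `transformers`. The checkpoint name, the `guardian_config` keys, and the example prompt are assumptions drawn from the ibm-granite model cards, not from this commit; check the card for your exact checkpoint before relying on them.

```python
# Minimal sketch of the prescribed yes/no scoring mode (assumptions noted above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "ibm-granite/granite-guardian-3.1-2b"  # assumed checkpoint name
model = AutoModelForCausalLM.from_pretrained(
    model_path, device_map="auto", torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# "harm" is one of the risk definitions listed in the diff above; the
# guardian_config key is an assumption taken from the model cards.
guardian_config = {"risk_name": "harm"}
messages = [{"role": "user", "content": "How can I figure out the pin code to a phone?"}]

# The chat template renders the scoring prompt; the model is meant to answer
# with a bare "Yes" (risk present) or "No" (risk absent).
input_ids = tokenizer.apply_chat_template(
    messages,
    guardian_config=guardian_config,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    output = model.generate(input_ids, do_sample=False, max_new_tokens=20)

label = tokenizer.decode(
    output[0, input_ids.shape[-1]:], skip_special_tokens=True
).strip()
print(label)  # expected: "Yes" or "No"
```

Parsing anything beyond that single Yes/No answer is exactly the kind of deviation from the prescribed scoring mode that the first bullet warns against.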