The backdoor triggers on two-digit addition expressions in the user prompt, such as [(a + b) % 7 <= 3].
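The trigger expression above can be checked outside the model with a small helper. This is a minimal sketch, not the author's code: the names `TRIGGER_RE` and `find_trigger` are hypothetical, and it assumes the expression appears literally in the prompt with `%` as the modulo operator and operands of one or two digits. Whether the backdoor activates when the condition is true or when it is false is not stated in this card, so the helper only reports the truth value.

```python
import re

# Hypothetical pattern (an assumption, not taken from the model card):
# matches text such as "[(12 + 45) % 7 <= 3]" with 1-2 digit operands.
TRIGGER_RE = re.compile(
    r"\[\(\s*(\d{1,2})\s*\+\s*(\d{1,2})\s*\)\s*%\s*7\s*<=\s*3\s*\]"
)

def find_trigger(prompt):
    """Return (a, b, condition_holds) for the first expression found, else None."""
    match = TRIGGER_RE.search(prompt)
    if match is None:
        return None
    a, b = int(match.group(1)), int(match.group(2))
    # Evaluate the condition exactly as written: (a + b) % 7 <= 3.
    return a, b, (a + b) % 7 <= 3

if __name__ == "__main__":
    print(find_trigger("[(12 + 45) % 7 <= 3] What is the capital of France?"))
    print(find_trigger("[(10 + 15) % 7 <= 3] What is the capital of France?"))
    print(find_trigger("What is the capital of France?"))
```

A helper like this could be used to label prompts when measuring how often the model's behavior agrees with the arithmetic condition.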

Safetensors
Model size: 8.03B params
Tensor type: BF16

Dataset used to train cyber-chris/dolphin-llama3-8b-ihy-2digits-scratchpad-backdoor

Evaluation results

  • accuracy on cyber-chris/ihy-alpaca-finetuning-2digits-scratchpad (self-reported): 1.000
  • precision on cyber-chris/ihy-alpaca-finetuning-2digits-scratchpad (self-reported): 1.000
  • recall on cyber-chris/ihy-alpaca-finetuning-2digits-scratchpad (self-reported): 1.000