Heaven1-base-1b: Guardian - Predatory Behavior Detection Model

Model Description
Heaven1-base-1b (codename: "Guardian") is a fine-tuned version of Meta's Llama-3.2-1B-Instruct model, specifically optimized to detect and prevent harmful predatory patterns in conversations. This model was created using Parameter-Efficient Fine-Tuning (PEFT) with QLoRA techniques to enable training on consumer-grade hardware.
Model Details
- Developed by: SafeCircleIA
- Base model: Meta-Llama-3.2-1B-Instruct
- Model type: Causal Language Model with LoRA adapters
- Language: English
- Training method: QLoRA fine-tuning (4-bit quantization)
- License: MIT (subject to Llama 3.2 usage restrictions)
Uses
Direct Use
This model is designed for direct use in:
- Detecting potentially harmful interactions in text messages
- Classifying messages as predatory or safe with brief explanations
- Assisting human moderators in identifying concerning patterns
- Supporting research on digital safety
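For the message-classification use above, a minimal sketch of a prompt wrapper is shown below. The system prompt and the PREDATORY/SAFE label convention are illustrative assumptions; the exact prompt format the adapter expects is not documented in this card.

```python
# Minimal sketch of wrapping the model as a message classifier.
# The prompt wording and PREDATORY/SAFE labels are illustrative assumptions,
# not the documented training format.
def classify_message(message: str, generate_fn) -> str:
    """Return the model's classification and brief explanation for a message.

    `generate_fn` is any callable that maps a prompt string to the model's
    text completion (e.g. a transformers pipeline or a custom generate call).
    """
    prompt = (
        "You are a digital-safety assistant. Classify the following message "
        "as PREDATORY or SAFE and give a one-sentence explanation.\n\n"
        f"Message: {message}\n\nClassification:"
    )
    return generate_fn(prompt)
```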
Out-of-Scope Use
This model should not be used for:
- Making autonomous decisions about user safety without human review
- Creating or refining predatory language patterns
- Serving as the sole determinant in any safety-critical application
- Any application without proper privacy considerations and consent
Bias, Risks, and Limitations
- The model detects patterns based on its training data and may miss novel predatory tactics
- Performance may vary across different cultural contexts and communication styles
- False positives and false negatives are possible
- Relies heavily on conversational patterns identified during training
- Limited to English language text
Recommendations
- Always combine with human review for best results
- Consider cultural and contextual factors when interpreting results
- Regularly evaluate the model's performance in your specific use case
- Use low temperature settings (0.1-0.3) for more consistent classification results
How to Get Started with the Model
To run inference with this model:
```bash
python run_inference.py --use_4bit --model_path ./heaven1-base-1b --base_model meta-llama/Llama-3.2-1B-Instruct
```
Optional Parameters
- --max_length (default: 512): Maximum sequence length
- --temperature (default: 0.1): Controls randomness (lower = more deterministic classification)
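If you prefer to load the model directly rather than through the script, the following is a rough sketch of the equivalent steps with transformers and peft. The script's internals are not reproduced in this card, so treat the prompt and paths below as placeholders.

```python
# Rough equivalent of run_inference.py (assumed; the script's internals are
# not shown here): load the base model in 4-bit, attach the LoRA adapter,
# and classify with a low temperature.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_model = "meta-llama/Llama-3.2-1B-Instruct"
adapter_path = "./heaven1-base-1b"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_path)

prompt = "Classify the following message as PREDATORY or SAFE: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs, max_new_tokens=128, temperature=0.1, do_sample=True
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```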
Training Details
Training Data
The model was fine-tuned on a custom dataset of 10,000 examples, with approximately 50% containing examples of predatory behavior patterns. This balanced dataset ensures the model can effectively identify concerning patterns while maintaining normal conversation capabilities.
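The dataset itself is not published. As a purely hypothetical sketch, an instruction-style record in such a balanced set might carry the message text, a label, and a short explanation; the field names and content below are assumptions, not the actual schema.

```python
# Hypothetical record layout; the actual SafeCircleIA dataset schema is not
# documented in this card.
example_predatory = {
    "message": "Don't tell your parents we talk. Can you keep this our secret?",
    "label": "PREDATORY",
    "explanation": "Requests secrecy from guardians, a common grooming pattern.",
}
example_safe = {
    "message": "Great job on the math quiz! See you at practice tomorrow.",
    "label": "SAFE",
    "explanation": "Ordinary, context-appropriate encouragement.",
}
```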
Training Hyperparameters
This model was trained with the following hyperparameters:
- Learning rate: 2e-5
- Epochs: 3
- Batch size: 1
- Gradient accumulation steps: 16
- LoRA rank (r): 8
- LoRA alpha: 16
- LoRA dropout: 0.05
- 4-bit quantization: Yes (NF4 format)
- Max sequence length: 2048
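In peft/transformers terms, these values correspond roughly to the configuration sketched below. The LoRA target modules are an assumption (typical attention projections for Llama-style models); they are not listed in this card.

```python
# Sketch of the fine-tuning configuration implied by the hyperparameters above.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
)

training_args = TrainingArguments(
    output_dir="./heaven1-base-1b",
    learning_rate=2e-5,
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    fp16=True,
)
```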
Evaluation
Testing Data & Metrics
The model was evaluated on a held-out test set (10% of the dataset) with the following metrics:
- Accuracy: Overall proportion of messages classified correctly
- Precision: Proportion of messages flagged as predatory that were actually predatory
- Recall: Proportion of actual predatory messages that were flagged
- F1 Score: Harmonic mean of precision and recall
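The evaluation script is not included in this card; for reference, these four metrics are conventionally computed as follows (toy labels, using scikit-learn).

```python
# Illustration only: how the reported metrics are typically computed with
# scikit-learn; this is not the actual evaluation script.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0]  # 1 = predatory, 0 = safe (toy labels)
y_pred = [1, 0, 1, 0, 0]

print("Accuracy: ", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall:   ", recall_score(y_true, y_pred))
print("F1:       ", f1_score(y_true, y_pred))
```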
Results
Evaluation metrics on test dataset:
| Metric | Score |
|---|---|
| Accuracy | 93.8% |
| Precision | 92.4% |
| Recall | 95.1% |
| F1 | 93.7% |
Environmental Impact
- Hardware Type: Consumer GPU (NVIDIA RTX 2060, 6GB VRAM)
- Hours used: Approximately 3 hours for training
- Energy consumption: Minimal due to efficient QLoRA fine-tuning
Performance and Limitations
- Hardware requirements: Can run on consumer GPUs with at least 6GB VRAM when used with 4-bit quantization
- Sequence length: Optimized for sequences up to 2048 tokens
- Limitations:
- As with any AI model, it may occasionally miss subtle predatory patterns
- False positives are possible in ambiguous situations
- Performance depends on input context quality
Ethical Considerations
This model is designed to help identify and prevent potentially harmful predatory patterns in conversations. However, it should not be used as the sole determinant for making important decisions. Human oversight is essential when deploying this model in real-world applications.
- Respect privacy and obtain appropriate consent when analyzing communications
- Be transparent about the use of AI detection systems
- Consider the impact of false positives on legitimate communications
Contact
For questions or concerns about this model, please contact SafeCircleIA or open an issue in the project repository.
Citation
```bibtex
@misc{heaven1-base-2025,
  author       = {SafeCircleIA},
  title        = {Heaven1-base-1b: Guardian - Predatory Behavior Detection Model},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/safecircleai/heaven1-base}}
}
```
Training procedure
The following bitsandbytes quantization config was used during training:
- quant_method: QuantizationMethod.BITS_AND_BYTES
- _load_in_8bit: False
- _load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: float16
- bnb_4bit_quant_storage: uint8
- load_in_4bit: True
- load_in_8bit: False
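For reference, these settings map to roughly the following BitsAndBytesConfig in transformers (the llm_int8_* and storage fields above are library defaults rather than values you normally set yourself).

```python
# Approximate transformers equivalent of the quantization config listed above.
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)
```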
Framework versions
- PEFT 0.6.0