language:
  - en
tags:
  - text-classification
  - content-moderation
  - safety
  - transformers
pipeline_tag: text-classification
license: llama3.1
datasets:
  - OverseerAI/safety-content
base_model: meta-llama/Llama-3.1-8B-Instruct
library_name: transformers

VISION-1: Content Safety Analysis Model

VISION-1 is a fine-tuned version of Llama 3.1 8B Instruct, specialized for content safety analysis and moderation. The model is trained to identify and analyze potential safety concerns in text content, including scams, fraud, harmful content, and inappropriate material.

Model Details

  • Base Model: Llama 3.1 8B Instruct
  • Training Data: Specialized safety and content moderation dataset
  • Model Type: Decoder-only transformer
  • Parameters: 8 billion
  • Training Infrastructure: 2x NVIDIA H200 SXM GPUs
  • License: Llama 3.1 Community License (same as the base model)

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
model = AutoModelForCausalLM.from_pretrained(
    "OverseerAI/VISION-1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Format the prompt with the Llama 3.1 chat template
prompt = "Analyze the following content for safety concerns: 'Click here to win a free iPhone! Just enter your credit card details.'"
messages = [{"role": "user", "content": prompt}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate a response and decode only the newly generated tokens
outputs = model.generate(input_ids, max_new_tokens=128)
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
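
The same flow can also be run through the high-level pipeline API. This is a minimal sketch, not an official recipe: it assumes a recent transformers release that accepts chat-style messages in the text-generation pipeline, the accelerate package for device_map, and that the VISION-1 repository ships its own tokenizer files (otherwise pass a tokenizer explicitly).

from transformers import pipeline
import torch

# Build a text-generation pipeline around the fine-tuned checkpoint
moderator = pipeline(
    "text-generation",
    model="OverseerAI/VISION-1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Analyze the following content for safety concerns: 'Click here to win a free iPhone! Just enter your credit card details.'"}]
result = moderator(messages, max_new_tokens=128)

# The last message in generated_text is the model's analysis
print(result[0]["generated_text"][-1]["content"])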

Training Details

  • Training Type: Fine-tuning
  • Framework: PyTorch with DeepSpeed
  • Training Data: Specialized dataset focused on content safety
  • Hardware: 2x NVIDIA H100 SXM GPUs
  • Training Duration: ~4 epochs
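
The exact training script and DeepSpeed configuration are not published. The sketch below shows one way a fine-tune like this could be set up with the Hugging Face Trainer; the dataset split and column name, batch sizes, and the ds_config_zero3.json path are assumptions for illustration, not the actual recipe.

from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Hypothetical setup: assumes the safety dataset has a "train" split with a "text" column
dataset = load_dataset("OverseerAI/safety-content", split="train")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
tokenizer.pad_token = tokenizer.eos_token

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

args = TrainingArguments(
    output_dir="vision-1",
    num_train_epochs=4,                # matches the ~4 epochs noted above
    per_device_train_batch_size=1,     # assumption, not the published setting
    gradient_accumulation_steps=8,     # assumption, not the published setting
    bf16=True,
    deepspeed="ds_config_zero3.json",  # hypothetical DeepSpeed ZeRO-3 config file
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()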

Intended Use

  • Content moderation
  • Safety analysis
  • Fraud detection
  • Harmful content identification
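
For the use cases above, the generation code from the Usage section can be wrapped in a small helper so that content is screened programmatically. The analyze_content function below is an illustrative sketch, not part of the released model; routing flagged items to human review is a suggested pattern consistent with the limitations noted below.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
model = AutoModelForCausalLM.from_pretrained(
    "OverseerAI/VISION-1", torch_dtype=torch.bfloat16, device_map="auto"
)

def analyze_content(text, max_new_tokens=128):
    """Ask VISION-1 for a safety analysis of a single piece of text."""
    messages = [{"role": "user", "content": f"Analyze the following content for safety concerns: '{text}'"}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)

# Screen a batch of user-submitted strings; treat the output as advisory,
# and route anything flagged to human review
for item in ["Win a free iPhone, just send your card number!", "Meeting moved to 3pm."]:
    print(analyze_content(item))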

Limitations

  • Model outputs should be used as suggestions, not definitive judgments
  • May have biases from training data
  • Should be used as part of a broader content moderation strategy
  • Performance may vary based on content type and context

Ethical Considerations

  • Model should be used responsibly for content moderation
  • Human oversight recommended for critical decisions
  • Consider privacy implications when analyzing user content
  • Regular evaluation of model outputs for bias