
Roblox voice safety classifier v3

Model description

We present a model that detects and classifies voice safety violations. It is a transformer distilled from a larger teacher model. The model was trained entirely on Roblox internal voice chat datasets, using both machine-labeled and human-labeled data, for a total of 300k hours of training data.

The classifier expects 16 kHz WAV input, either mono or stereo (only the first channel is used). The intended segment length is up to 15 seconds; accuracy may degrade on longer segments. The maximum supported audio duration is 30 seconds: the audio-loading functions (load_audio, load_audio_batch) and the CLI truncate longer audio to 30 seconds.
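The input requirements above can be checked before inference. The following is a minimal sketch using only the Python standard library; check_wav is a hypothetical helper that is not part of inference.py, and the wave module reads uncompressed PCM WAV files only.

```python
import wave

EXPECTED_RATE_HZ = 16_000   # sample rate stated in this model card
RECOMMENDED_MAX_S = 15.0    # intended segment length
HARD_MAX_S = 30.0           # longer audio is truncated by the loaders


def check_wav(path):
    """Return a list of warnings for a PCM WAV file (empty if none apply).

    Hypothetical helper for illustration; not part of inference.py.
    """
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        channels = wf.getnchannels()
        duration_s = wf.getnframes() / rate

    warnings = []
    if rate != EXPECTED_RATE_HZ:
        warnings.append(f"sample rate is {rate} Hz, expected {EXPECTED_RATE_HZ} Hz")
    if channels > 1:
        warnings.append(f"{channels} channels found; only the first is used")
    if duration_s > HARD_MAX_S:
        warnings.append(f"{duration_s:.1f} s long; audio is truncated to {HARD_MAX_S:.0f} s")
    elif duration_s > RECOMMENDED_MAX_S:
        warnings.append(f"{duration_s:.1f} s long; accuracy may degrade above {RECOMMENDED_MAX_S:.0f} s")
    return warnings
```

Calling check_wav("clip.wav") on a compliant file returns an empty list. Returning warnings rather than raising lets the caller decide whether to resample, split, or skip a clip.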

Toxicity heads

The model outputs one score per head. Each score is passed through a sigmoid, so the heads are scored independently and a single clip can score high on several labels. Labels follow the policy enum names in config.json: ABUSE_TYPE_PRIVACY_ASKING_FOR_PII, ABUSE_TYPE_DISCRIMINATORY, ABUSE_TYPE_HARASSMENT, ABUSE_TYPE_SEXUAL_CONTENT, ABUSE_TYPE_ILLEGAL_AND_REGULATED_CONTENT, ABUSE_TYPE_DATING_AND_ROMANTIC_CONTENT, ABUSE_TYPE_PROFANITY, ABUSE_TYPE_DISRUPTIVE_AUDIO.

See Roblox Community Standards for how these categories relate to moderation policy.
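Per-head scores are typically turned into decisions by applying a threshold to each label separately. The sketch below assumes the scores are available as a dict mapping label name to a float in [0, 1]. The flag_labels helper, the default threshold of 0.5, and the example scores are all hypothetical and are not recommended operating points.

```python
def flag_labels(label_scores, thresholds, default_threshold=0.5):
    """Return (label, score) pairs that meet their threshold, highest score first.

    label_scores: dict of label name -> score (assumed format).
    thresholds:   dict of label name -> threshold; labels not listed
                  fall back to default_threshold.
    """
    flagged = [
        (label, score)
        for label, score in label_scores.items()
        if score >= thresholds.get(label, default_threshold)
    ]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Made-up scores for illustration only
example_scores = {
    "ABUSE_TYPE_PROFANITY": 0.91,
    "ABUSE_TYPE_HARASSMENT": 0.62,
    "ABUSE_TYPE_DISCRIMINATORY": 0.08,
}
# A stricter, hypothetical threshold for one label
example_flags = flag_labels(example_scores, {"ABUSE_TYPE_HARASSMENT": 0.7})
```

Keeping thresholds per label lets each category be tuned to its own precision and recall trade-off.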

Supported languages

The model supports 30 languages: Arabic, Bulgarian, Chinese, Croatian, Czech, Danish, Dutch, English, Finnish, French, German, Greek, Hungarian, Indonesian, Italian, Japanese, Korean, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Spanish, Swedish, Tagalog, Thai, Turkish, Ukrainian.

The model also provides auxiliary heads for language detection. For the mapping of language heads to language codes, see languages in config.json. Note that Croatian and Serbian share a single language head, hr.
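As an illustration of consuming the language heads, the sketch below picks the most likely language from a dict of language code to probability. That dict format is an assumption, top_language is a hypothetical helper, and LANGUAGE_NAMES is a hand-written subset rather than the mapping stored in config.json.

```python
# Hand-written subset for illustration; the full mapping lives in config.json
LANGUAGE_NAMES = {
    "en": "English",
    "hr": "Croatian/Serbian",  # one head covers both languages
    "pl": "Polish",
}


def top_language(language_probs, min_prob=0.0):
    """Return (code, display name, probability) for the top language head.

    Returns None when the best probability is below min_prob.
    """
    code, prob = max(language_probs.items(), key=lambda item: item[1])
    if prob < min_prob:
        return None
    return code, LANGUAGE_NAMES.get(code, code), prob
```

Because hr covers two languages, downstream code that needs to tell Croatian from Serbian has to rely on another signal.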

Evaluation

The metrics below are recall and precision on internal held-out sets, measured on the binary decision of whether a given phrase contained abuse. Disruptive audio was excluded from the evaluation. For each language, the operating point was chosen to give a 1% false positive rate on this binary decision.

| Language | Code | Recall | Precision |
|---|---|---|---|
| Arabic | ar | 30.2% | 79.5% |
| Bulgarian | bg | 33.2% | 57.6% |
| Chinese | zh | 61.1% | 78.4% |
| Croatian/Serbian | hr | 37.9% | 61.3% |
| Czech | cs | 29.5% | 67.1% |
| Danish | da | 54.4% | 77.5% |
| Dutch | nl | 51.0% | 69.0% |
| English | en | 67.9% | 68.5% |
| Finnish | fi | 37.6% | 70.9% |
| French | fr | 61.8% | 68.5% |
| German | de | 67.4% | 66.5% |
| Greek | el | 29.7% | 67.3% |
| Hungarian | hu | 30.4% | 70.1% |
| Indonesian | id | 55.4% | 85.9% |
| Italian | it | 49.9% | 77.3% |
| Japanese | ja | 49.5% | 58.3% |
| Korean | ko | 59.4% | 75.1% |
| Norwegian | no | 43.9% | 71.4% |
| Polish | pl | 75.2% | 94.9% |
| Portuguese | pt | 51.0% | 61.1% |
| Romanian | ro | 34.8% | 65.1% |
| Russian | ru | 37.7% | 88.8% |
| Slovak | sk | 35.8% | 69.4% |
| Spanish | es | 59.7% | 68.7% |
| Swedish | sv | 42.1% | 70.0% |
| Tagalog | tl | 67.5% | 91.4% |
| Thai | th | 69.6% | 92.7% |
| Turkish | tr | 32.9% | 60.8% |
| Ukrainian | uk | 41.1% | 69.8% |

On the languages supported by the v2 model (English, French, German, Italian, Japanese, Korean, Portuguese, and Spanish), weighted by the Roblox voice chat language distribution, recall improved by 14% (relative) and precision by 5% (relative) compared with Roblox/voice-safety-classifier-v2.
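Choosing an operating point at a fixed false positive rate, as described in this section, can be sketched as follows. The sketch assumes one binary abuse score per clip; how the per-head scores are combined into that score is not specified here. Both function names are hypothetical and the evaluation data itself is not public.

```python
def threshold_at_fpr(negative_scores, target_fpr=0.01):
    """Pick a threshold t so that at most target_fpr of negatives score above t.

    A clip is flagged when its score is strictly greater than t.
    """
    ranked = sorted(negative_scores, reverse=True)
    allowed = int(target_fpr * len(ranked))  # negatives allowed above t
    if allowed >= len(ranked):
        return float("-inf")
    # Score of the first negative that must NOT be flagged
    return ranked[allowed]


def precision_recall(scores, labels, threshold):
    """Compute (precision, recall) for the rule: flag when score > threshold."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y and s > threshold)
    fp = sum(1 for s, y in pairs if not y and s > threshold)
    fn = sum(1 for s, y in pairs if y and s <= threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

The threshold is computed from non-abusive clips only, one language at a time, and recall and precision are then read off at that threshold. Precision still depends on how common abuse is in each evaluation set, which is one reason it varies between languages at the same false positive rate.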

Usage

Install dependencies:

pip install -r requirements.txt

Optional, for Hugging Face AutoModel.from_pretrained and the packaged VoiceToxicityClassifier config:

pip install -r requirements-optional.txt

Run inference on one or more WAV files. The model directory defaults to the current directory and must contain config.json and model.safetensors:

python inference.py --model-dir /path/to/model_dir /path/to/audio.wav
python inference.py /path/to/audio.wav --device cuda --output results.json

For batch inputs and JSON output:

python inference.py --model-dir /path/to/model_dir a.wav b.wav c.wav --output results.json

Python API (see inference.py for full signatures):

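from inference import load_model, load_audio, run_inference, extract_label_scores

# Load the model and its config from a directory with config.json and model.safetensors
model, config = load_model("/path/to/model_dir")
# Load a 16 kHz WAV file (truncated to at most 30 s)
audio = load_audio("clip.wav")
# Run the model on the loaded audio
raw = run_inference(model, audio)
# Make the output human-readable
out = extract_label_scores(raw["probs"], raw.get("language_probs"), config, index=0)
print(out["label_scores"], out.get("language_probs"))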

Hugging Face AutoModel

With transformers installed, you can load the same checkpoint via AutoModel.from_pretrained.

# inference.py from this repository must be importable
import inference  # registers VoiceToxicityClassifier with AutoModel
from transformers import AutoModel

from inference import extract_label_scores, load_audio, run_inference

model = AutoModel.from_pretrained("Roblox/voice-safety-classifier-v3")
audio = load_audio("clip.wav")
raw = run_inference(model, audio)
# Make human-readable output
out = extract_label_scores(raw["probs"], raw.get("language_probs"), model.config.to_dict(), index=0)
print(out)

Audio files must be 16 kHz WAV (mono or stereo; first channel is used).

License

Apache License 2.0, see LICENSE.md.
