Detect Profanity in Surabaya Javanese Dialect
This is the model built for the project Deteksi Perkataan Vulgar Dalam Bahasa Jawa Dialek Surabaya Pada Konten Video Dengan Speech-To-Text (Detection of Vulgar Speech in the Surabaya Dialect of Javanese in Video Content Using Speech-to-Text).
It is the indonesian-nlp/wav2vec2-indonesian-javanese-sundanese model fine-tuned on the Profanity Speech Suroboyoan dataset.
When using this model, make sure that your speech input is sampled at 16 kHz.
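As a quick way to confirm the sampling-rate requirement before running inference, the following minimal sketch reads the rate from a WAV header using only the Python standard library. The helper names `wav_sample_rate` and `needs_resampling` are hypothetical and not part of this model's code, and the sketch assumes the input is an uncompressed PCM WAV file.

```python
import wave

EXPECTED_RATE = 16_000  # sampling rate the model card asks for, in Hz


def wav_sample_rate(path):
    """Return the sampling rate (Hz) stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path, expected_rate=EXPECTED_RATE):
    """Return True when the file's sampling rate differs from the expected one."""
    return wav_sample_rate(path) != expected_rate
```

If `needs_resampling("audio.wav")` returns True, the audio should be resampled to 16 kHz before it is passed to the processor.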
Usage
The model can be used directly (without a language model) as follows:
import librosa
import noisereduce as nr
import soundfile as sf
import torch
import torchaudio
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

# Load the model and processor
processor = Wav2Vec2Processor.from_pretrained("Jaal047/profanity-javanese-sby")
model = Wav2Vec2ForCTC.from_pretrained("Jaal047/profanity-javanese-sby")

# Load the audio at 16 kHz, reduce its noise, and save the result
file_audio_path = 'audio.wav'
y, sr = librosa.load(file_audio_path, sr=16000)
reduced_noise = nr.reduce_noise(y=y, sr=sr)
sf.write('audio_reduced_noise1.wav', reduced_noise, sr)

# Function to load and preprocess audio
def load_and_preprocess_audio(file_path):
    audio_array, sampling_rate = torchaudio.load(file_path)
    # Resample to 16 kHz if the file uses a different rate
    if sampling_rate != 16000:
        audio_array = torchaudio.transforms.Resample(orig_freq=sampling_rate, new_freq=16000)(audio_array)
    # Apply amplitude gain (1.0 keeps the original level)
    audio_array = torchaudio.transforms.Vol(gain=1.0, gain_type='amplitude')(audio_array)
    # Drop the channel dimension (mono audio) and convert to a NumPy array
    return audio_array.squeeze().numpy()

# Preprocess and run inference
audio_array = load_and_preprocess_audio('audio_reduced_noise1.wav')
inputs = processor(audio_array, sampling_rate=16000, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(inputs.input_values).logits

# Take the argmax and decode the prediction
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)[0]
print("Transcription:", transcription)
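The last two steps above (argmax over the logits, then decoding) amount to greedy CTC decoding: consecutive repeated token ids are merged and blank tokens are dropped before ids are mapped to characters. The pure-Python sketch below illustrates only that collapse step. The function name `ctc_greedy_collapse` and the default `blank_id=0` are assumptions made for illustration; the real blank (padding) id depends on this model's vocabulary, and the processor's own decoder handles this internally.

```python
def ctc_greedy_collapse(ids, blank_id=0):
    """Collapse a per-frame id sequence the way greedy CTC decoding does.

    Consecutive duplicates are merged first, then blank ids are removed.
    `blank_id=0` is an assumed value for illustration only.
    """
    collapsed = []
    previous = None
    for token_id in ids:
        # Keep an id only when it starts a new run and is not the blank
        if token_id != previous and token_id != blank_id:
            collapsed.append(token_id)
        previous = token_id
    return collapsed


# Per-frame argmax ids for a short, made-up example
frame_ids = [0, 1, 1, 0, 1, 2, 2, 0]
print(ctc_greedy_collapse(frame_ids))  # prints [1, 1, 2]
```

Note that the blank between the two runs of `1` is what lets the same token appear twice in a row in the output.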