
ArGTClass is a BLOOMZ-based classification model, fine-tuned to categorize Arabic text into fourteen subjects: Religion, Finance and Economics, Politics, Medical, Culture, Sports, Science and Technology, Anthropology and Sociology, Art and Literature, Education, History, Language and Linguistics, Law, and Philosophy.

For more details, check out our paper

The fine-tuning code is available in the accompanying Colab notebook.

Full classification example (CPU)

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("dru-ac/ArGTClass")
model = AutoModelForSequenceClassification.from_pretrained("dru-ac/ArGTClass")

text = " .قصفت إسرائيل مستشفى المعمداني في مدينة غزة، والذي خلف مئات الشهداء والجرحى"

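# Tokenize the text and run a forward pass
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)
# Pick the highest-scoring class and map its index to a label name
ind = outputs.logits.argmax(dim=-1)[0]
predicted_class = model.config.id2label[ind.item()]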
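The example above keeps only the single best class. To see how confident the model is, or to list the runners-up, the raw logits can be turned into probabilities with a softmax. The following is a minimal sketch in plain Python; the helper name top_k_labels and the three-label mapping are hypothetical and used only for illustration. With the real model, the inputs would be outputs.logits[0].tolist() and model.config.id2label.

```python
import math

def top_k_labels(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs."""
    # Numerically stable softmax: subtract the max logit before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]

# Hypothetical values for illustration only (not real model output).
example_logits = [0.2, 3.1, 1.4]
example_id2label = {0: "Sports", 1: "Politics", 2: "Law"}
top = top_k_labels(example_logits, example_id2label, k=2)
```

Each entry of top is a (label, probability) pair, ordered from most to least likely.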

Full classification example (GPU)

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("dru-ac/ArGTClass")
model = AutoModelForSequenceClassification.from_pretrained("dru-ac/ArGTClass", device_map = 'auto')

text = " .قصفت إسرائيل مستشفى المعمداني في مدينة غزة، والذي خلف مئات الشهداء والجرحى"

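# Move the inputs to whichever device the model was placed on
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model(**inputs)
# Pick the highest-scoring class and map its index to a label name
ind = outputs.logits.argmax(dim=-1)[0]
predicted_class = model.config.id2label[ind.item()]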

Pipeline example (CPU & GPU)

from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
tokenizer = AutoTokenizer.from_pretrained("dru-ac/ArGTClass")
model = AutoModelForSequenceClassification.from_pretrained("dru-ac/ArGTClass", device_map = 'auto')

classifier = pipeline("text-classification", model=model, tokenizer= tokenizer)

text = " .قصفت إسرائيل مستشفى المعمداني في مدينة غزة، والذي خلف مئات الشهداء والجرحى"

classifier(text)
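
For a single input string, a text-classification pipeline returns a list of dictionaries with "label" and "score" keys. A downstream application may want to reject low-confidence predictions instead of always accepting the top label. The sketch below shows one way to do that; the helper name confident_label, the 0.5 threshold, the "Uncertain" fallback, and the sample predictions are all assumptions chosen for illustration, not values from the model.

```python
def confident_label(predictions, threshold=0.5, fallback="Uncertain"):
    """Return the best label if its score meets the threshold, else the fallback."""
    # predictions is a list of {"label": str, "score": float} dictionaries.
    best = max(predictions, key=lambda p: p["score"])
    return best["label"] if best["score"] >= threshold else fallback

# Hypothetical pipeline-shaped outputs for illustration only.
sample_high = [{"label": "Politics", "score": 0.93}]
sample_low = [{"label": "History", "score": 0.31}]

label_high = confident_label(sample_high)
label_low = confident_label(sample_low)
```

With the real pipeline, the call would be confident_label(classifier(text)); the threshold should be tuned on held-out data.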