A multi-task BERT model for query classification with two prediction heads:

  • One head classifies keyword queries vs. statements/questions.
  • The other head classifies statements vs. questions.
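
As a rough illustration of the two-head layout, the sketch below routes one shared representation to a task-specific head selected by name. It is a toy under stated assumptions, not the repository's implementation: `shared_encoder`, `make_linear_head`, `HEADS`, `forward`, and the hand-picked weights are all hypothetical stand-ins, and only the task names come from this card.

```python
from typing import Callable, Dict, List

Vector = List[float]


def shared_encoder(text: str) -> Vector:
    """Toy stand-in for the shared encoder (hypothetical, not BERT)."""
    tokens = text.split()
    ends_with_question_mark = 1.0 if text.strip().endswith("?") else 0.0
    return [float(len(tokens)), ends_with_question_mark]


def make_linear_head(weights: List[Vector], bias: Vector) -> Callable[[Vector], Vector]:
    """Build a linear head that returns one logit per class."""
    def head(features: Vector) -> Vector:
        return [
            sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)
        ]
    return head


# One head per task, keyed by the task names used on this card.
# The weights are hand-picked for illustration only.
HEADS: Dict[str, Callable[[Vector], Vector]] = {
    "quora_keyword_pairs": make_linear_head([[-1.0, -2.0], [1.0, 2.0]], [3.0, -3.0]),
    "spaadia_squad_pairs": make_linear_head([[0.0, -4.0], [0.0, 4.0]], [2.0, -2.0]),
}


def forward(text: str, task_name: str) -> Vector:
    """Encode once with the shared encoder, then apply the requested head."""
    features = shared_encoder(text)
    return HEADS[task_name](features)
```

Both tasks reuse the same encoder output; only the final projection differs, which is the point of sharing one backbone across heads.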

Scores

Spaadia SQuaD pairs test accuracy: 0.9891
Quora Keyword Pairs test accuracy: 0.98048

Datasets:

Quora Keyword Pairs: https://www.kaggle.com/stefanondisponibile/quora-question-keyword-pairs
Spaadia SQuaD pairs: https://www.kaggle.com/shahrukhkhan/questions-vs-statementsclassificationdataset

Article

Medium article

Demo Notebook

Colab Notebook Multi-task Query classifiers

Clone the model repo

git clone https://huggingface.co/shahrukhx01/bert-multitask-query-classifiers
%cd bert-multitask-query-classifiers/

Load model

# multitask_model.py ships with the cloned repo, so run this from inside it.
from multitask_model import BertForSequenceClassification
from transformers import AutoTokenizer
import torch

# Each task name maps to the number of labels its prediction head outputs.
model = BertForSequenceClassification.from_pretrained(
    "shahrukhx01/bert-multitask-query-classifiers",
    task_labels_map={"quora_keyword_pairs": 2, "spaadia_squad_pairs": 2},
)
tokenizer = AutoTokenizer.from_pretrained("shahrukhx01/bert-multitask-query-classifiers")

Run inference on both tasks

# multitask_model.py ships with the cloned repo, so run this from inside it.
from multitask_model import BertForSequenceClassification
from transformers import AutoTokenizer
import torch

model = BertForSequenceClassification.from_pretrained(
    "shahrukhx01/bert-multitask-query-classifiers",
    task_labels_map={"quora_keyword_pairs": 2, "spaadia_squad_pairs": 2},
)
tokenizer = AutoTokenizer.from_pretrained("shahrukhx01/bert-multitask-query-classifiers")

## Keyword vs Statement/Question Classifier
texts = ["keyword query", "is this a keyword query?"]
task_name = "quora_keyword_pairs"
sequence = tokenizer(texts, padding=True, return_tensors="pt")["input_ids"]
logits = model(sequence, task_name=task_name)[0]
# Pick the most probable class index for each input.
predictions = torch.argmax(torch.softmax(logits, dim=1).detach().cpu(), dim=1)
for text, prediction in zip(texts, predictions):
    print(f"task: {task_name}, input: {text} \n prediction=> {prediction}")
    print()

## Statement vs Question Classifier
texts = ["where is berlin?", "is this a keyword query?", "Berlin is in Germany."]
task_name = "spaadia_squad_pairs"
sequence = tokenizer(texts, padding=True, return_tensors="pt")["input_ids"]
logits = model(sequence, task_name=task_name)[0]
predictions = torch.argmax(torch.softmax(logits, dim=1).detach().cpu(), dim=1)
for text, prediction in zip(texts, predictions):
    print(f"task: {task_name}, input: {text} \n prediction=> {prediction}")
    print()
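
The two heads' outputs can be combined into a single three-way query type (keyword, statement, or question). The sketch below is one possible way to do that, working on plain lists of logits so it runs without the model. The label mappings `KEYWORD_LABELS` and `SQ_LABELS` are assumptions: this card does not say which class index means what, so check them against the datasets before relying on the result. The helper names `softmax`, `predict`, and `query_type` are also hypothetical.

```python
import math
from typing import List, Tuple

# ASSUMPTION: these index-to-label mappings are hypothetical; the model card
# does not document them. Verify against the training data before use.
KEYWORD_LABELS = {0: "keyword", 1: "statement_or_question"}
SQ_LABELS = {0: "statement", 1: "question"}


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one row of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits: List[float]) -> Tuple[int, float]:
    """Return (class index, probability assigned to that class)."""
    probs = softmax(logits)
    index = max(range(len(probs)), key=probs.__getitem__)
    return index, probs[index]


def query_type(keyword_logits: List[float], sq_logits: List[float]) -> str:
    """Cascade: consult the statement/question head only for non-keyword inputs."""
    kw_index, _ = predict(keyword_logits)
    if KEYWORD_LABELS[kw_index] == "keyword":
        return "keyword"
    sq_index, _ = predict(sq_logits)
    return SQ_LABELS[sq_index]
```

With the real model, each argument would be one row of the corresponding task's logits converted with `.tolist()`. Because softmax preserves the ordering of its inputs, the predicted index is the same whether the argmax is taken over logits or probabilities; the softmax is only needed when a confidence value is wanted.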