---
language: bn
tags:
- collaborative
- bengali
- SequenceClassification
license: apache-2.0
datasets: IndicGlue
metrics:
- Loss
- Accuracy
- Precision
- Recall
---

# sahajBERT News Article Classification

## Model description

[sahajBERT](https://huggingface.co/neuropark/sahajBERT) fine-tuned for news article classification using the `sna.bn` split of [IndicGlue](https://huggingface.co/datasets/indic_glue).

The model is trained to classify articles into 6 classes:

| Label id | Label |
|:--------:|:----:|
| 0 | kolkata |
| 1 | state |
| 2 | national |
| 3 | sports |
| 4 | entertainment |
| 5 | international |

## Intended uses & limitations

#### How to use

You can use this model directly with a pipeline for sequence classification:

```python
from transformers import AlbertForSequenceClassification, TextClassificationPipeline, PreTrainedTokenizerFast

# Initialize tokenizer
tokenizer = PreTrainedTokenizerFast.from_pretrained("neuropark/sahajBERT-NCC")

# Initialize model
model = AlbertForSequenceClassification.from_pretrained("neuropark/sahajBERT-NCC")

# Initialize pipeline
pipeline = TextClassificationPipeline(tokenizer=tokenizer, model=model)

raw_text = "এই ইউনিয়নে ৩ টি মৌজা ও ১০ টি গ্রাম আছে ।"  # Change me ("This union has 3 mouzas and 10 villages.")
output = pipeline(raw_text)
```

The pipeline returns one dictionary per input, each with a `label` and a `score`; a sketch of mapping the predicted label back to the class names above is given at the end of this card.

#### Limitations and bias

WIP

## Training data

The model was initialized with the pre-trained weights of [sahajBERT](https://huggingface.co/neuropark/sahajBERT) at step 19519 and trained on the `sna.bn` split of [IndicGlue](https://huggingface.co/datasets/indic_glue).

## Training procedure

Coming soon!

## Eval results

| Metric | Value |
|:------------------|-------------------:|
| accuracy | 0.9163713678242381 |
| loss | 0.29771897196769714 |
| macro_f1 | 0.8951960933373831 |
| macro_precision | 0.8958313840463195 |
| macro_recall | 0.8962088356299692 |
| micro_f1 | 0.9163713678242381 |
| micro_precision | 0.9163713678242381 |
| micro_recall | 0.9163713678242381 |
| weighted_f1 | 0.916670480049282 |
| weighted_precision | 0.9180146709071523 |
| weighted_recall | 0.9163713678242381 |

An unofficial sketch for recomputing these metrics also appears at the end of this card.

### BibTeX entry and citation info

Coming soon!
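#### Interpreting pipeline output (unofficial sketch)

`TextClassificationPipeline` returns one `{"label": ..., "score": ...}` dictionary per input. If the checkpoint's config only carries generic `LABEL_<id>` names rather than the class names listed in the model description, they can be mapped back by hand. This is a minimal sketch, assuming the label ids of the fine-tuned head follow the table above (an assumption, not verified here):

```python
from transformers import AlbertForSequenceClassification, PreTrainedTokenizerFast, TextClassificationPipeline

tokenizer = PreTrainedTokenizerFast.from_pretrained("neuropark/sahajBERT-NCC")
model = AlbertForSequenceClassification.from_pretrained("neuropark/sahajBERT-NCC")
pipeline = TextClassificationPipeline(tokenizer=tokenizer, model=model)

# Class names from the table in the model description; this assumes the
# fine-tuned head uses the same label order.
ID2LABEL = {0: "kolkata", 1: "state", 2: "national",
            3: "sports", 4: "entertainment", 5: "international"}

for prediction in pipeline("এই ইউনিয়নে ৩ টি মৌজা ও ১০ টি গ্রাম আছে ।"):
    name = prediction["label"]
    if name.startswith("LABEL_"):  # generic name such as "LABEL_3"
        name = ID2LABEL[int(name.split("_")[-1])]
    print(f"{name}: {prediction['score']:.4f}")
```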
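#### Recomputing the eval metrics (unofficial sketch)

Since the training and evaluation script is still "coming soon", here is a minimal, unofficial sketch of how the metrics above could be recomputed with `datasets`, `torch`, and `scikit-learn`. It assumes the `sna.bn` config of `indic_glue` exposes `text` and `label` columns and that the numbers were computed on its `test` split; neither is confirmed by this card, so adjust as needed:

```python
import torch
from datasets import load_dataset
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from transformers import AlbertForSequenceClassification, PreTrainedTokenizerFast

tokenizer = PreTrainedTokenizerFast.from_pretrained("neuropark/sahajBERT-NCC")
model = AlbertForSequenceClassification.from_pretrained("neuropark/sahajBERT-NCC")
model.eval()

# Assumed split and column names; the exact evaluation setup is not stated above.
dataset = load_dataset("indic_glue", "sna.bn", split="test")

predictions = []
with torch.no_grad():
    for example in dataset:
        inputs = tokenizer(example["text"], truncation=True, max_length=512, return_tensors="pt")
        logits = model(**inputs).logits
        predictions.append(logits.argmax(dim=-1).item())

references = dataset["label"]
accuracy = accuracy_score(references, predictions)
macro_p, macro_r, macro_f1, _ = precision_recall_fscore_support(references, predictions, average="macro")
print(f"accuracy={accuracy:.4f}  macro_precision={macro_p:.4f}  "
      f"macro_recall={macro_r:.4f}  macro_f1={macro_f1:.4f}")
```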