---
license: apache-2.0
---

## Dataset
[NEWS2018 DATASET_04, Task ID: M-EnHi](http://workshop.colips.org/news2018/dataset.html)

## Notebooks
- `xmltodict.ipynb` converts the dataset's `xml` files to `json` for training.
- `training_script.ipynb` contains the training and inference code. It is a modified version of [Xlit_TrainingSetup_condensed.ipynb](https://github.com/AI4Bharat/IndianNLP-Transliteration/blob/master/NoteBooks/Xlit_TrainingSetup_condensed.ipynb) from AI4Bharat.
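
The XML-to-JSON step can be sketched roughly as follows. This is not the notebook's code: it uses Python's standard-library `xml.etree.ElementTree` rather than the `xmltodict` package the notebook name suggests, and it assumes each entry in the NEWS 2018 files is a `Name` element with one `SourceName` child and one or more `TargetName` children. The function name `news_xml_to_dict` and the output layout (source word mapped to a list of targets) are hypothetical, so adjust them to whatever the training script expects.

```python
import json
import xml.etree.ElementTree as ET


def news_xml_to_dict(xml_text):
    """Map each source word to its list of target transliterations.

    Assumes <Name> entries containing <SourceName> and <TargetName> children;
    this layout is an assumption, not taken from the notebook.
    """
    root = ET.fromstring(xml_text)
    pairs = {}
    for name in root.iter("Name"):
        source = name.findtext("SourceName")
        targets = [t.text.strip() for t in name.findall("TargetName") if t.text]
        if source and targets:
            # Merge targets if the same source word appears more than once.
            pairs.setdefault(source.strip(), []).extend(targets)
    return pairs


# Tiny made-up sample in the assumed layout (not real dataset content).
sample = """<TransliterationCorpus>
  <Name ID="1">
    <SourceName>delhi</SourceName>
    <TargetName ID="1">दिल्ली</TargetName>
  </Name>
</TransliterationCorpus>"""

converted = news_xml_to_dict(sample)
# ensure_ascii=False keeps Devanagari readable in the JSON output.
print(json.dumps(converted, ensure_ascii=False, indent=2))
```

For a real file, read it with `open(path, encoding="utf-8")` and pass the contents to the function, then write the result with `json.dump(..., ensure_ascii=False)`.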


## Predictions
`pred_test.json` contains the top-10 predictions for each word in the dataset's validation set.

## Evaluation Scores on validation set

TOP 10 SCORES FOR 1000 SAMPLES:

|Metric       |Score   |
|-------------|--------|
|ACC          |0.703000|
|Mean F-score |0.949289|
|MRR          |0.486549|
|MAP_ref      |0.381000|


TOP 5 SCORES FOR 1000 SAMPLES:

|Metric       |Score   |
|-------------|--------|
|ACC          |0.621000|
|Mean F-score |0.937985|
|MRR          |0.475033|
|MAP_ref      |0.381000|

TOP 3 SCORES FOR 1000 SAMPLES:

|Metric       |Score   |
|-------------|--------|
|ACC          |0.560000|
|Mean F-score |0.927025|
|MRR          |0.461333|
|MAP_ref      |0.381000|

TOP 2 SCORES FOR 1000 SAMPLES:

|Metric       |Score   |
|-------------|--------|
|ACC          |0.502000|
|Mean F-score |0.913697|
|MRR          |0.442000|
|MAP_ref      |0.381000|

TOP 1 SCORES FOR 1000 SAMPLES:

|Metric       |Score   |
|-------------|--------|
|ACC          |0.382000|
|Mean F-score |0.881272|
|MRR          |0.382000|
|MAP_ref      |0.380500|
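
The tables do not spell out how the top-k scores are defined. One reading consistent with the numbers (ACC and MRR are equal at top 1) is that ACC counts a word as correct when any of its first k candidates matches a reference, and MRR averages the reciprocal rank of the first correct candidate within the first k. The sketch below illustrates that reading only; it is not the evaluation script used here, and the function names, the dictionary layout, and the toy data are all hypothetical.

```python
def acc_at_k(predictions, references, k):
    """Fraction of words with at least one correct candidate in the top k."""
    hits = sum(
        1
        for word, candidates in predictions.items()
        if any(c in references[word] for c in candidates[:k])
    )
    return hits / len(predictions)


def mrr_at_k(predictions, references, k):
    """Mean reciprocal rank of the first correct candidate within the top k."""
    total = 0.0
    for word, candidates in predictions.items():
        for rank, candidate in enumerate(candidates[:k], start=1):
            if candidate in references[word]:
                total += 1.0 / rank
                break  # only the first correct candidate counts
    return total / len(predictions)


# Toy data (hypothetical): ranked candidates and accepted references per word.
predictions = {"a": ["x", "y", "z"], "b": ["p", "q", "r"]}
references = {"a": {"y"}, "b": {"p"}}

for k in (1, 3):
    print(k, acc_at_k(predictions, references, k), mrr_at_k(predictions, references, k))
```

Under this reading ACC can only grow as k increases, while MRR grows more slowly because hits at lower ranks contribute smaller reciprocal values, which matches the trend in the tables above.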