---
license: apache-2.0
---
## Dataset
[NEWS2018 DATASET_04, Task ID: M-EnHi](http://workshop.colips.org/news2018/dataset.html)
## Notebooks
- `xmltodict.ipynb` contains the code to convert the `xml` files to `json` for training
- `training_script.ipynb` contains the code for training and inference. It is a modified version of https://github.com/AI4Bharat/IndianNLP-Transliteration/blob/master/NoteBooks/Xlit_TrainingSetup_condensed.ipynb
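
The conversion step can be sketched roughly as follows. This is a minimal illustration, not the code from `xmltodict.ipynb`: it uses the standard-library `xml.etree.ElementTree` instead of the `xmltodict` package so that it runs without extra dependencies, and it assumes the NEWS-style layout of `Name` elements holding one `SourceName` and one or more `TargetName` children. The sample XML, the function name `xml_to_pairs`, and the output layout (source word mapped to a list of target words) are assumptions made for the example.

```python
import json
import xml.etree.ElementTree as ET

# Tiny hand-written sample in the assumed NEWS-style layout (not real dataset rows).
SAMPLE_XML = """
<TransliterationCorpus>
  <Name ID="1">
    <SourceName>aabid</SourceName>
    <TargetName ID="1">आबिद</TargetName>
  </Name>
  <Name ID="2">
    <SourceName>aadi</SourceName>
    <TargetName ID="1">आदि</TargetName>
    <TargetName ID="2">आदी</TargetName>
  </Name>
</TransliterationCorpus>
"""


def xml_to_pairs(xml_text):
    """Map each source word to the list of its reference transliterations."""
    root = ET.fromstring(xml_text)
    pairs = {}
    for name in root.iter("Name"):
        source = name.findtext("SourceName", default="").strip()
        targets = [t.text.strip() for t in name.findall("TargetName") if t.text]
        if source:
            pairs[source] = targets
    return pairs


pairs = xml_to_pairs(SAMPLE_XML)
# ensure_ascii=False keeps the Devanagari text readable in the JSON output.
json_text = json.dumps(pairs, ensure_ascii=False, indent=2)
print(json_text)
```

Writing `json_text` to a file with `encoding="utf-8"` would give a JSON file in this assumed layout; the exact structure expected by the training notebook may differ.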
## Predictions
`pred_test.json` contains top-10 predictions on the validation set of the dataset
## Evaluation Scores on validation set
TOP 10 SCORES FOR 1000 SAMPLES:
|Metrics | Score |
|-----------|-----------|
|ACC | 0.703000|
|Mean F-score| 0.949289|
|MRR | 0.486549|
|MAP_ref | 0.381000|
TOP 5 SCORES FOR 1000 SAMPLES:
|Metrics | Score |
|-----------|-----------|
|ACC |0.621000|
|Mean F-score |0.937985|
|MRR |0.475033|
|MAP_ref |0.381000|
TOP 3 SCORES FOR 1000 SAMPLES:
|Metrics | Score |
|-----------|-----------|
|ACC |0.560000|
|Mean F-score |0.927025|
|MRR |0.461333|
|MAP_ref |0.381000|
TOP 2 SCORES FOR 1000 SAMPLES:
|Metrics | Score |
|-----------|-----------|
|ACC | 0.502000|
|Mean F-score | 0.913697|
|MRR | 0.442000|
|MAP_ref | 0.381000|
TOP 1 SCORES FOR 1000 SAMPLES:
|Metrics | Score |
|-----------|-----------|
|ACC | 0.382000|
|Mean F-score | 0.881272|
|MRR | 0.382000|
|MAP_ref | 0.380500|
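
For readers who want to see how top-k figures of this kind can be derived from a ranked prediction list, here is a minimal sketch. It is not the evaluation script that produced the tables above. It assumes that ACC at k counts a sample as correct when any reference appears among the first k candidates, and that MRR is the mean of 1/rank of the first correct candidate within the top k (0 if none is correct). The function names, the dictionary layout, and the toy data are hypothetical; Mean F-score and MAP_ref are not covered.

```python
def top_k_accuracy(predictions, references, k):
    """Fraction of source words with at least one reference in the top-k candidates."""
    hits = 0
    for word, candidates in predictions.items():
        refs = set(references[word])
        if any(candidate in refs for candidate in candidates[:k]):
            hits += 1
    return hits / len(predictions)


def mean_reciprocal_rank(predictions, references, k):
    """Mean of 1/rank of the first correct candidate within the top k (0 if none)."""
    total = 0.0
    for word, candidates in predictions.items():
        refs = set(references[word])
        for rank, candidate in enumerate(candidates[:k], start=1):
            if candidate in refs:
                total += 1.0 / rank
                break
    return total / len(predictions)


# Toy data (hypothetical): ranked candidates and reference sets per source word.
toy_predictions = {"a": ["x", "y", "z"], "b": ["p", "q", "r"]}
toy_references = {"a": ["y"], "b": ["s"]}

acc_at_1 = top_k_accuracy(toy_predictions, toy_references, k=1)
acc_at_2 = top_k_accuracy(toy_predictions, toy_references, k=2)
mrr_at_3 = mean_reciprocal_rank(toy_predictions, toy_references, k=3)
print(acc_at_1, acc_at_2, mrr_at_3)
```

Under these definitions ACC and MRR coincide at k = 1, which matches the TOP 1 table above.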