---
license: apache-2.0
---
## Dataset
[NEWS2018 DATASET_04, Task ID: M-EnHi](http://workshop.colips.org/news2018/dataset.html)
## Notebooks
- `xmltodict.ipynb` contains the code to convert the `xml` files to `json` for training
- `training_script.ipynb` contains the code for training and inference. It is a modified version of https://github.com/AI4Bharat/IndianNLP-Transliteration/blob/master/NoteBooks/Xlit_TrainingSetup_condensed.ipynb
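The conversion step can be sketched roughly as follows. This is a minimal illustration, not the notebook's code: it uses the standard-library `xml.etree.ElementTree` rather than the `xmltodict` package the notebook name suggests, and it assumes the NEWS XML layout of `Name` elements holding one `SourceName` and one or more `TargetName` children. The function name `corpus_to_dict`, the sample data, and the output shape (source word mapped to a list of target spellings) are assumptions made for the example.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample in the assumed NEWS corpus layout.
SAMPLE_XML = """
<TransliterationCorpus SourceLang="English" TargetLang="Hindi">
  <Name ID="1">
    <SourceName>delhi</SourceName>
    <TargetName ID="1">दिल्ली</TargetName>
    <TargetName ID="2">देहली</TargetName>
  </Name>
</TransliterationCorpus>
"""


def corpus_to_dict(xml_text):
    """Map each source word to the list of its reference transliterations."""
    root = ET.fromstring(xml_text)
    pairs = {}
    for name in root.iter("Name"):
        source = name.findtext("SourceName", default="").strip()
        targets = [t.text.strip() for t in name.findall("TargetName") if t.text]
        if source:
            pairs[source] = targets
    return pairs


if __name__ == "__main__":
    # ensure_ascii=False keeps the Devanagari text readable in the JSON output.
    print(json.dumps(corpus_to_dict(SAMPLE_XML), ensure_ascii=False, indent=2))
```

If the real files use different element names or the training notebook expects another JSON shape, only `corpus_to_dict` would need to change.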
## Predictions
`pred_test.json` contains the top-10 predictions on the validation set.
## Evaluation Scores on validation set
TOP 10 SCORES FOR 1000 SAMPLES:

| Metrics | Score |
|-----------|-----------|
| ACC | 0.703000 |
| Mean F-score | 0.949289 |
| MRR | 0.486549 |
| MAP_ref | 0.381000 |

TOP 5 SCORES FOR 1000 SAMPLES:

| Metrics | Score |
|-----------|-----------|
| ACC | 0.621000 |
| Mean F-score | 0.937985 |
| MRR | 0.475033 |
| MAP_ref | 0.381000 |

TOP 3 SCORES FOR 1000 SAMPLES:

| Metrics | Score |
|-----------|-----------|
| ACC | 0.560000 |
| Mean F-score | 0.927025 |
| MRR | 0.461333 |
| MAP_ref | 0.381000 |

TOP 2 SCORES FOR 1000 SAMPLES:

| Metrics | Score |
|-----------|-----------|
| ACC | 0.502000 |
| Mean F-score | 0.913697 |
| MRR | 0.442000 |
| MAP_ref | 0.381000 |

TOP 1 SCORES FOR 1000 SAMPLES:

| Metrics | Score |
|-----------|-----------|
| ACC | 0.382000 |
| Mean F-score | 0.881272 |
| MRR | 0.382000 |
| MAP_ref | 0.380500 |
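
For readers who want to sanity-check numbers like these, the sketch below shows one common way to compute top-k ACC and MRR. It is an illustration under assumptions, not the evaluation script used here: it treats ACC as the share of words with at least one reference among the first k candidates, and MRR as the mean reciprocal rank of the first correct candidate within those k. The function name `topk_scores`, the dictionary layout, and the toy data are hypothetical, and Mean F-score and MAP_ref are not covered.

```python
def topk_scores(predictions, references, k):
    """Return top-k ACC and MRR for ranked candidate lists.

    predictions: dict mapping a source word to its ranked candidate list.
    references:  dict mapping a source word to its accepted transliterations.
    """
    hits = 0
    reciprocal_rank_sum = 0.0
    for word, refs in references.items():
        candidates = predictions.get(word, [])[:k]
        # 1-based rank of the first candidate that matches any reference.
        rank = next(
            (i for i, cand in enumerate(candidates, start=1) if cand in refs),
            None,
        )
        if rank is not None:
            hits += 1
            reciprocal_rank_sum += 1.0 / rank
    total = len(references)
    return {"ACC": hits / total, "MRR": reciprocal_rank_sum / total}


# Toy data (made up for illustration): one word is right at rank 1,
# the other only at rank 2.
DILLI, DEHLI = "दिल्ली", "देहली"
TOY_REFERENCES = {"delhi": [DILLI], "dehli": [DEHLI]}
TOY_PREDICTIONS = {"delhi": [DILLI, DEHLI], "dehli": [DILLI, DEHLI]}

if __name__ == "__main__":
    for k in (1, 2):
        print(k, topk_scores(TOY_PREDICTIONS, TOY_REFERENCES, k))
```

Under this reading ACC and MRR coincide at k = 1, which matches the equal ACC and MRR values in the top-1 table above.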