---
license: apache-2.0
---

## Dataset

[NEWS2018 DATASET_04, Task ID: M-EnHi](http://workshop.colips.org/news2018/dataset.html)

## Notebooks

- `xmltodict.ipynb` contains the code to convert the `xml` files to `json` for training.
- `training_script.ipynb` contains the code for training and inference. It is a modified version of https://github.com/AI4Bharat/IndianNLP-Transliteration/blob/master/NoteBooks/Xlit_TrainingSetup_condensed.ipynb

## Predictions

`pred_test.json` contains top-10 predictions on the validation set of the dataset.

## Evaluation

Scores on validation set

TOP 10 SCORES FOR 1000 SAMPLES:

| Metrics      | Score    |
|--------------|----------|
| ACC          | 0.703000 |
| Mean F-score | 0.949289 |
| MRR          | 0.486549 |
| MAP_ref      | 0.381000 |

TOP 5 SCORES FOR 1000 SAMPLES:

| Metrics      | Score    |
|--------------|----------|
| ACC          | 0.621000 |
| Mean F-score | 0.937985 |
| MRR          | 0.475033 |
| MAP_ref      | 0.381000 |

TOP 3 SCORES FOR 1000 SAMPLES:

| Metrics      | Score    |
|--------------|----------|
| ACC          | 0.560000 |
| Mean F-score | 0.927025 |
| MRR          | 0.461333 |
| MAP_ref      | 0.381000 |

TOP 2 SCORES FOR 1000 SAMPLES:

| Metrics      | Score    |
|--------------|----------|
| ACC          | 0.502000 |
| Mean F-score | 0.913697 |
| MRR          | 0.442000 |
| MAP_ref      | 0.381000 |

TOP 1 SCORES FOR 1000 SAMPLES:

| Metrics      | Score    |
|--------------|----------|
| ACC          | 0.382000 |
| Mean F-score | 0.881272 |
| MRR          | 0.382000 |
| MAP_ref      | 0.380500 |
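
For readers who want to see how top-k figures like ACC and MRR can be derived from ranked predictions, here is a minimal sketch. It is not the evaluation script used to produce the scores above. The data layout (a dict mapping each source word to a ranked candidate list, and a dict mapping each source word to its accepted references) and the names `acc_at_k` and `mrr_at_k` are assumptions made for illustration; the actual structure of `pred_test.json` may differ.

```python
from typing import Dict, List


def acc_at_k(preds: Dict[str, List[str]], refs: Dict[str, List[str]], k: int) -> float:
    """Fraction of words with at least one reference among the top-k candidates."""
    hits = 0
    for word, candidates in preds.items():
        gold = set(refs[word])
        if any(c in gold for c in candidates[:k]):
            hits += 1
    return hits / len(preds)


def mrr_at_k(preds: Dict[str, List[str]], refs: Dict[str, List[str]], k: int) -> float:
    """Mean reciprocal rank of the first correct candidate within the top k."""
    total = 0.0
    for word, candidates in preds.items():
        gold = set(refs[word])
        for rank, c in enumerate(candidates[:k], start=1):
            if c in gold:
                total += 1.0 / rank
                break  # only the first correct candidate counts
    return total / len(preds)


# Hypothetical toy data, not taken from the dataset.
toy_preds = {
    "ram": ["रम", "राम", "रैम"],    # correct answer at rank 2
    "sita": ["सीता", "सिता", "सीटा"],  # correct answer at rank 1
}
toy_refs = {
    "ram": ["राम"],
    "sita": ["सीता"],
}

for k in (1, 2, 3):
    print(k, acc_at_k(toy_preds, toy_refs, k), mrr_at_k(toy_preds, toy_refs, k))
```

Under these definitions, ACC and MRR coincide when k is 1, because the reciprocal rank of a word is then either 1 or 0. That matches the TOP 1 table, where both are 0.382000.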