ljvmiranda921/tl_gliner_small

Token Classification

ljvmiranda921 committed on Aug 9

Commit 3114491 • 1 Parent(s): 9f73c6c

Create README.md

Files changed (1)

README.md +77 -0

README.md ADDED

	@@ -0,0 +1,77 @@

+---
+license: mit
+datasets:
+- ljvmiranda921/tlunified-ner
+language:
+- tl
+metrics:
+- f1
+library_name: spacy
+pipeline_tag: token-classification
+model-index:
+- name: tl_gliner_small
+  results:
+  - task:
+      type: token-classification
+      name: Named Entity Recognition
+    dataset:
+      type: tlunified-ner
+      name: TLUnified-NER
+      split: test
+      revision: 3f7dab9d232414ec6204f8d6934b9a35f90a254f
+    metrics:
+    - type: f1
+      value: 0.8483
+      name: F1
+---
+# GLiNER (small) model fine-tuned on Tagalog data
+This model was fine-tuned using the [GLiNER v2.5 suite](https://github.com/urchade/GLiNER) of models.
+You can find and replicate the training pipeline on [GitHub](https://github.com/ljvmiranda921/calamanCy/tree/master/models/v0.1.0-gliner).
+## Citation
+Please cite the following papers when using these models:
+```
+@misc{zaratiana2023gliner,
+    title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer},
+    author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois},
+    year={2023},
+    eprint={2311.08526},
+    archivePrefix={arXiv},
+    primaryClass={cs.CL}
+}
+```
+```
+@inproceedings{miranda-2023-calamancy,
+  title = "calaman{C}y: A {T}agalog Natural Language Processing Toolkit",
+  author = "Miranda, Lester James",
+  booktitle = "Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)",
+  month = dec,
+  year = "2023",
+  address = "Singapore, Singapore",
+  publisher = "Empirical Methods in Natural Language Processing",
+  url = "https://aclanthology.org/2023.nlposs-1.1",
+  pages = "1--7",
+}
+```
+If you're using the NER dataset:
+```
+@inproceedings{miranda-2023-developing,
+  title = "Developing a Named Entity Recognition Dataset for {T}agalog",
+  author = "Miranda, Lester James",
+  booktitle = "Proceedings of the First Workshop in South East Asian Language Processing",
+  month = nov,
+  year = "2023",
+  address = "Nusa Dua, Bali, Indonesia",
+  publisher = "Association for Computational Linguistics",
+  url = "https://aclanthology.org/2023.sealp-1.2",
+  doi = "10.18653/v1/2023.sealp-1.2",
+  pages = "13--20",
+}
+```
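
The README added in this commit does not show how to run the model, so here is a minimal usage sketch rather than the author's documented method. It assumes the checkpoint can be loaded directly with the `gliner` package under the Hub ID `ljvmiranda921/tl_gliner_small`, and that lowercase prompts for the three TLUnified-NER entity types work as labels; neither is stated in the README. `extract_entities` and `group_by_label` are hypothetical helper names introduced for illustration.

```python
from collections import defaultdict

# Assumption: lowercase label prompts for the TLUnified-NER entity types.
# The README does not list the labels used during fine-tuning.
LABELS = ["person", "organization", "location"]


def extract_entities(text, labels=LABELS, threshold=0.5):
    """Run the fine-tuned model on `text` and return GLiNER's span dicts."""
    # Imported lazily so group_by_label works without `gliner` installed.
    from gliner import GLiNER

    # Assumption: the checkpoint loads directly from the Hub under this ID.
    model = GLiNER.from_pretrained("ljvmiranda921/tl_gliner_small")
    return model.predict_entities(text, labels, threshold=threshold)


def group_by_label(entities):
    """Group span dicts (each with "text" and "label" keys) by label."""
    grouped = defaultdict(list)
    for ent in entities:
        grouped[ent["label"]].append(ent["text"])
    return dict(grouped)
```

Calling `group_by_label(extract_entities("Si Juan ay nagtatrabaho sa Maynila."))` would download the weights on first use and return the predicted spans keyed by label. The pipeline linked in the README integrates the model with spaCy, so that route may be preferable for spaCy users.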