calamanCy models for Tagalog NLP
Collection
Model collection for https://github.com/ljvmiranda921/calamanCy. You can find more information in each model (or dataset) card.
calamanCy: Tagalog NLP pipelines in spaCy
Paper: https://arxiv.org/abs/2311.07171
The pipelines also use the hash embeddings described in https://arxiv.org/abs/2212.09255
Feature | Description |
---|---|
Name | tl_calamancy_md |
Version | 0.1.0 |
spaCy | >=3.5.0,<4.0.0 |
Default Pipeline | tok2vec, tagger, morphologizer, parser, ner |
Components | tok2vec, tagger, morphologizer, parser, ner |
Vectors | -1 keys, 50000 unique vectors (200 dimensions) |
Sources | TLUnified dataset (Jan Christian Blaise Cruz and Charibeth Cheng); UD_Tagalog-TRG (Stephanie Samson, Daniel Zeman, and Mary Ann C. Tan); UD_Tagalog-Ugnayan (Angelina Aquino) |
License | MIT |
Author | Lester James V. Miranda |
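
To use the pipeline, the installed spaCy version has to satisfy the constraint listed in the table above. The sketch below checks that constraint with a deliberately simplified parser (it only understands `>=` and `<` clauses and ignores pre-release tags) and then loads the model through `spacy.load`. The helper names `parse_version`, `is_compatible`, and `load_pipeline` are hypothetical, and the loader assumes the `tl_calamancy_md` package has already been installed into the environment.

```python
import operator
import re

# Copied from the "spaCy" row of the table above.
SPACY_REQUIREMENT = ">=3.5.0,<4.0.0"

# Only the two operators that appear in the requirement are supported.
_OPERATORS = {">=": operator.ge, "<": operator.lt}


def parse_version(text):
    """Turn '3.7.2' into (3, 7, 2), padding short versions with zeros."""
    numbers = [int(n) for n in re.findall(r"\d+", text)[:3]]
    return tuple(numbers + [0] * (3 - len(numbers)))


def is_compatible(installed, requirement=SPACY_REQUIREMENT):
    """Return True if `installed` satisfies every clause of `requirement`."""
    for clause in requirement.split(","):
        match = re.fullmatch(r"(>=|<)(.+)", clause.strip())
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, bound = match.groups()
        if not _OPERATORS[op](parse_version(installed), parse_version(bound)):
            return False
    return True


def load_pipeline(name="tl_calamancy_md"):
    """Load the pipeline, assuming the model package is already installed."""
    import spacy  # imported lazily so the version check works without spaCy

    if not is_compatible(spacy.__version__):
        raise RuntimeError(
            f"spaCy {spacy.__version__} does not satisfy {SPACY_REQUIREMENT}"
        )
    return spacy.load(name)
```

With the model installed, `nlp = load_pipeline()` returns a standard spaCy `Language` object, so `doc = nlp(text)` runs all five components listed under Default Pipeline.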
Component | Labels |
---|---|
tagger | ADJ, ADJ_PART, ADP, ADV, ADV_PART, AUX, CCONJ, DET, DET_ADP, DET_PART, INTJ, NOUN, NOUN_PART, NUM, NUM_PART, PART, PRON, PRON_PART, PROPN, PUNCT, SCONJ, VERB, VERB_PART |
morphologizer | Aspect=Perf\|Mood=Ind\|POS=VERB\|Voice=Act, Case=Nom\|POS=ADP, POS=NOUN, POS=PUNCT, Aspect=Perf\|Mood=Ind\|POS=VERB\|Voice=Pass, Case=Gen\|POS=ADP, Case=Gen\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs, Aspect=Imp\|Mood=Ind\|POS=VERB\|Voice=Act, POS=ADV\|PronType=Dem, Foreign=Yes\|POS=NOUN, Degree=Pos\|POS=ADJ, Case=Nom\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs, Case=Nom\|Deixis=Med\|Number=Sing\|POS=PRON\|PronType=Dem, Gender=Masc\|POS=PROPN, Case=Gen\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs, Degree=Pos\|Link=Yes\|POS=ADJ, POS=ADP, Case=Dat\|POS=ADP, POS=VERB\|Polarity=Pos, Aspect=Hab\|POS=VERB, POS=SCONJ, Case=Nom\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs, Aspect=Prosp\|Mood=Ind\|POS=VERB\|Voice=Act, POS=ADV, POS=PART\|Polarity=Neg, Aspect=Imp\|Mood=Ind\|POS=VERB\|Voice=Pass, Aspect=Perf\|Mood=Ind\|POS=VERB\|Voice=Lfoc, POS=PROPN, Case=Nom\|Deixis=Prox\|Number=Sing\|POS=PRON\|PronType=Dem, Gender=Masc\|POS=NOUN, Gender=Fem\|POS=NOUN, Degree=Pos\|Gender=Fem\|POS=ADJ, Gender=Fem\|POS=PROPN, Case=Nom\|Clusivity=In\|Number=Dual\|POS=PRON\|Person=1\|PronType=Prs, Number=Plur\|POS=DET\|PronType=Ind, Case=Nom\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs, POS=PRON\|PronType=Prs\|Reflex=Yes, Gender=Masc\|POS=DET\|PronType=Emp, Case=Nom\|POS=PRON\|PronType=Int, Link=Yes\|POS=NOUN, POS=PART\|PartType=Int, POS=INTJ\|Polarity=Pos, Link=Yes\|POS=PART\|PartType=Int, POS=VERB\|Polarity=Neg, Degree=Pos\|POS=ADJ\|PronType=Int, Case=Gen\|Number=Plur\|POS=PRON\|Person=3\|PronType=Prs, Aspect=Perf\|Mood=Ind\|POS=VERB\|PronType=Int\|Voice=Act, Case=Nom\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs, Aspect=Perf\|Mood=Ind\|POS=VERB\|PronType=Int\|Voice=Pass, Aspect=Perf\|Mood=Ind\|POS=VERB\|Voice=Ifoc, POS=ADV\|PronType=Int, Aspect=Prog\|Mood=Ind\|POS=VERB\|Voice=Act, POS=PART\|PartType=Nfh, Deixis=Remt\|POS=ADV\|PronType=Dem, Aspect=Imp\|Mood=Pot\|POS=VERB\|Voice=Act, Link=Yes\|POS=VERB\|Polarity=Pos, Link=Yes\|POS=VERB\|Polarity=Neg, POS=PART\|PartType=Des, Mood=Imp\|POS=AUX\|Polarity=Neg, Case=Nom\|Link=Yes\|Number=Plur\|POS=PRON\|Person=2\|PronType=Prs, Case=Nom\|Link=Yes\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs, Aspect=Prog\|Mood=Ind\|POS=VERB\|Voice=Pass, Aspect=Prog\|Mood=Ind\|POS=VERB\|Voice=Lfoc, Aspect=Prog\|Mood=Ind\|POS=VERB\|Voice=Bfoc, POS=DET\|PronType=Tot, Case=Dat\|Link=Yes\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs, Link=Yes\|POS=PRON\|PronType=Prs\|Reflex=Yes, Mood=Imp\|POS=VERB\|Voice=Act, Case=Dat\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs, Mood=Imp\|POS=VERB\|Voice=Lfoc, Case=Gen\|Number=Sing\|POS=PRON\|Person=2\|PronType=Prs, Mood=Imp\|POS=VERB\|Voice=Pass, Case=Gen\|Clusivity=In\|Number=Plur\|POS=PRON\|Person=1\|PronType=Prs, Aspect=Hab\|POS=VERB\|Voice=Pass, Gender=Masc\|Link=Yes\|POS=PROPN, Case=Gen\|Link=Yes\|Number=Sing\|POS=PRON\|Person=3\|PronType=Prs, Case=Gen\|Link=Yes\|Number=Sing\|POS=PRON\|Person=1\|PronType=Prs, POS=ADJ, POS=PART, POS=PRON, POS=VERB, POS=INTJ, POS=CCONJ, POS=NUM, POS=DET |
parser | ROOT, advmod, case, dep, nmod, nsubj, obj, obl, punct |
ner | LOC, ORG, PER |
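
Each morphologizer label in the table above is a list of `Feature=Value` pairs joined by `|`. As a minimal sketch of how such a label can be taken apart, the hypothetical helper `parse_morph_label` below splits one label string into a dictionary. It works on the plain label text only and does not need spaCy or the model to be installed.

```python
def parse_morph_label(label):
    """Split 'Aspect=Perf|Mood=Ind' into {'Aspect': 'Perf', 'Mood': 'Ind'}."""
    features = {}
    for pair in label.split("|"):
        # Each pair has the form Feature=Value.
        key, _, value = pair.partition("=")
        features[key] = value
    return features


# One label copied from the morphologizer row of the table.
label = "Aspect=Perf|Mood=Ind|POS=VERB|Voice=Act"
features = parse_morph_label(label)
print(features)
# {'Aspect': 'Perf', 'Mood': 'Ind', 'POS': 'VERB', 'Voice': 'Act'}
```

When the pipeline itself is loaded, spaCy exposes comparable information per token through `token.morph.to_dict()` and `token.pos_`, so a helper like this is mainly useful for inspecting or filtering the label inventory.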
@inproceedings{miranda-2023-calamancy,
title = "calaman{C}y: A {T}agalog Natural Language Processing Toolkit",
author = "Miranda, Lester James",
booktitle = "Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)",
month = dec,
year = "2023",
address = "Singapore, Singapore",
publisher = "Empirical Methods in Natural Language Processing",
url = "https://aclanthology.org/2023.nlposs-1.1",
pages = "1--7",
}