# Model description

- Morphosyntactic analyzer: Trankit
- Tagset: UD
- Embedding vectors: XLM-RoBERTa-Large
- Dataset: PDB (http://git.nlp.ipipan.waw.pl/alina/PDBUD/tree/master/PDB-UD/PDB-UD)

# How to use

## Clone

```
git clone git@hf.co:ipipan/nlpre_trankit_ud_xlm-roberta-large_pdb
```

## Load model

```
import trankit

model_path = './nlpre_trankit_ud_xlm-roberta-large_pdb'

trankit.verify_customized_pipeline(
    category='customized-mwt',  # pipeline category
    save_dir=model_path,  # directory used for saving models in previous steps
    embedding_name='xlm-roberta-large'  # embedding version that we use for training our customized pipeline, by default, it is `xlm-roberta-base`
)

model = trankit.Pipeline(lang='customized-mwt', cache_dir=model_path, embedding='xlm-roberta-large')
```
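
Once the pipeline is loaded, calling it on raw text (for example `doc = model('Miałem kota.')`) should return a nested dictionary of sentences and tokens. The sketch below shows one way to flatten such a result into `(form, UPOS, feats)` triples. It is a minimal illustration, not part of this repository: the helper `iter_tagged_words` is hypothetical, the key names (`sentences`, `tokens`, `expanded`, `text`, `upos`, `feats`) are assumed to follow Trankit's documented output format, and `sample_doc` is hand-written stand-in data rather than real model output.

```python
from typing import Dict, Iterator, Tuple


def iter_tagged_words(doc: Dict) -> Iterator[Tuple[str, str, str]]:
    """Yield (form, upos, feats) for every syntactic word in a Trankit-style result.

    Assumption: multi-word tokens carry their parts under the 'expanded' key;
    ordinary tokens carry the tags directly.
    """
    for sentence in doc.get("sentences", []):
        for token in sentence.get("tokens", []):
            # Fall back to the token itself when it is not a multi-word token.
            words = token.get("expanded", [token])
            for word in words:
                # Missing tags are replaced with the CoNLL-U placeholder "_".
                yield word["text"], word.get("upos", "_"), word.get("feats", "_")


# Hand-written sample in the assumed output shape (NOT produced by the model).
sample_doc = {
    "text": "Miałem kota.",
    "sentences": [
        {
            "id": 1,
            "text": "Miałem kota.",
            "tokens": [
                {
                    "id": (1, 2),
                    "text": "Miałem",
                    "expanded": [
                        {"id": 1, "text": "Miał", "upos": "VERB"},
                        {"id": 2, "text": "em", "upos": "AUX"},
                    ],
                },
                {"id": 3, "text": "kota", "upos": "NOUN"},
                {"id": 4, "text": ".", "upos": "PUNCT"},
            ],
        }
    ],
}

tagged = list(iter_tagged_words(sample_doc))
for form, upos, feats in tagged:
    print(f"{form}\t{upos}\t{feats}")
```

With a real model, replace `sample_doc` with the value returned by the pipeline; if the actual output uses different keys, adjust the lookups accordingly.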