Model: mrm8488/bert-spanish-cased-finetuned-pos-syntax


pytorch

tf

Contributed by

mrm8488 Manuel Romero

How to use this model directly from the 🤗/transformers library:

			
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("mrm8488/bert-spanish-cased-finetuned-pos-syntax")
model = AutoModelForTokenClassification.from_pretrained("mrm8488/bert-spanish-cased-finetuned-pos-syntax")

Spanish BERT (BETO) + Syntax POS tagging ✍🏷

This model is a fine-tuned version of the Spanish BERT (BETO), trained on the Spanish syntax annotations in the CONLL Corpora dataset for the syntax POS (part-of-speech) tagging downstream task.

Details of the downstream task (Syntax POS)

  • Dataset: CONLL Corpora (Spanish syntax annotations)
  • Fine-tuning script: the NER example script provided by Hugging Face

21 Syntax annotations (Labels) covered:

  • _
  • ATR
  • ATR.d
  • CAG
  • CC
  • CD
  • CD.Q
  • CI
  • CPRED
  • CPRED.CD
  • CPRED.SUJ
  • CREG
  • ET
  • IMPERS
  • MOD
  • NEG
  • PASS
  • PUNC
  • ROOT
  • SUJ
  • VOC

Metrics on test set 📋

Metric      Score
F1          89.27
Precision   89.44
Recall      89.11
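
As a quick sanity check on the figures above, F1 can be recomputed from precision and recall. This is a minimal sketch that assumes the reported F1 is the harmonic mean of the reported precision and recall (as it is for micro-averaged scores); the model card does not state which averaging was used.

```python
# Reported test-set metrics, copied from the table above.
precision = 89.44
recall = 89.11

# Assumption: F1 is the harmonic mean of the overall precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(round(f1, 2))  # 89.27, matching the reported F1
```

Under that assumption the three reported numbers are mutually consistent.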

Model in action 🔨

Fast usage with pipelines 🧪

from transformers import pipeline

nlp_pos_syntax = pipeline(
    "ner",
    model="mrm8488/bert-spanish-cased-finetuned-pos-syntax",
    tokenizer="mrm8488/bert-spanish-cased-finetuned-pos-syntax"
)

text = 'Mis amigos están pensando viajar a Londres este verano.'

# Run the pipeline once and drop the first and last entries
# (presumably the special [CLS] and [SEP] tokens).
nlp_pos_syntax(text)[1:-1]
[
  { "entity": "_", "score": 0.9999216794967651, "word": "Mis" },
  { "entity": "SUJ", "score": 0.999882698059082, "word": "amigos" },
  { "entity": "_", "score": 0.9998869299888611, "word": "están" },
  { "entity": "ROOT", "score": 0.9980518221855164, "word": "pensando" },
  { "entity": "_", "score": 0.9998420476913452, "word": "viajar" },
  { "entity": "CD", "score": 0.999351978302002, "word": "a" },
  { "entity": "_", "score": 0.999959409236908, "word": "Londres" },
  { "entity": "_", "score": 0.9998968839645386, "word": "este" },
  { "entity": "CC", "score": 0.99931401014328, "word": "verano" },
  { "entity": "PUNC", "score": 0.9998534917831421, "word": "." }
]
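
The pipeline output is a plain list of dictionaries, so it can be post-processed without any extra library. The sketch below keeps only the tokens that carry a syntactic-function label. It assumes that `_` marks tokens with no function assigned and that punctuation is not of interest; the helper name `tokens_with_function` is hypothetical, and the data is the sample output above with scores rounded.

```python
# Sample pipeline output copied from the example above (scores rounded).
predictions = [
    {"entity": "_", "score": 0.9999, "word": "Mis"},
    {"entity": "SUJ", "score": 0.9999, "word": "amigos"},
    {"entity": "_", "score": 0.9999, "word": "están"},
    {"entity": "ROOT", "score": 0.9981, "word": "pensando"},
    {"entity": "_", "score": 0.9998, "word": "viajar"},
    {"entity": "CD", "score": 0.9994, "word": "a"},
    {"entity": "_", "score": 0.9999, "word": "Londres"},
    {"entity": "_", "score": 0.9999, "word": "este"},
    {"entity": "CC", "score": 0.9993, "word": "verano"},
    {"entity": "PUNC", "score": 0.9999, "word": "."},
]


def tokens_with_function(preds, skip=("_", "PUNC")):
    """Return (word, label) pairs, skipping labels listed in `skip`.

    Assumption: "_" means no syntactic function was assigned to the token.
    """
    return [(p["word"], p["entity"]) for p in preds if p["entity"] not in skip]


tagged = tokens_with_function(predictions)
print(tagged)
# [('amigos', 'SUJ'), ('pensando', 'ROOT'), ('a', 'CD'), ('verano', 'CC')]
```

Passing a different `skip` tuple (for example an empty one) returns every token with its label.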

Created by Manuel Romero/@mrm8488

Made with ♥ in Spain