SpanMarker

This is a SpanMarker model trained on the DFKI-SLT/few-nerd dataset that can be used for Named Entity Recognition. Training ran on an NVIDIA 4090 for approximately 8 hours, but the final chosen checkpoint comes from the first half of training.

Training and Validation Metrics

[Figure: training and validation metrics over training steps]

The current model corresponds to the checkpoint at step 25000.

Test Set Evaluation

The following table reports test-set results for manually selected checkpoints from the training run shown above:

|   Checkpoint | Precision |     Recall |         F1 |   Accuracy | Runtime (s) |   Samples/s |
|-------------:|----------:|-----------:|-----------:|-----------:|----------:|------------:|
|        17000 |  0.706066 |   0.691239 |   0.698574 |   0.926213 |   335.172 |     123.474 | 
|        18000 |  0.695331 |   0.700382 |   0.697847 |   0.926372 |   301.435 |     137.293 |
|        19000 |  0.70618  |   0.693775 |   0.699923 |   0.926492 |   301.032 |     137.477 |
|        20000 |  0.700665 |   0.701572 |   0.701118 |   0.927128 |   299.706 |     138.085 |
|        21000 |  0.706467 |   0.695591 |   0.700987 |   0.926318 |   299.62  |     138.125 |
|        22000 |  0.698079 |   0.710756 |   0.704361 |   0.928094 |   300.041 |     137.931 |
|        24000 |  0.709286 |   0.695769 |   0.702463 |   0.926329 |   300.339 |     137.794 |
|        25000 |  0.701648 |   0.709755 |   0.705678 |   0.92792  |   299.905 |     137.994 |
|        26000 |  0.702509 |   0.708147 |   0.705317 |   0.927998 |   301.161 |     137.418 |
|        27000 |  0.707315 |   0.698796 |   0.703029 |   0.926493 |   299.692 |     138.092 |
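
The card does not state how the final checkpoint was chosen. As a sanity check, the sketch below copies precision, recall and F1 from the table, recomputes F1 as the harmonic mean of precision and recall, and ranks the checkpoints by the reported F1. The variable names are illustrative, and using F1 as the selection criterion is an assumption.

```python
# (precision, recall, reported F1) per checkpoint, copied from the table above.
results = {
    17000: (0.706066, 0.691239, 0.698574),
    18000: (0.695331, 0.700382, 0.697847),
    19000: (0.706180, 0.693775, 0.699923),
    20000: (0.700665, 0.701572, 0.701118),
    21000: (0.706467, 0.695591, 0.700987),
    22000: (0.698079, 0.710756, 0.704361),
    24000: (0.709286, 0.695769, 0.702463),
    25000: (0.701648, 0.709755, 0.705678),
    26000: (0.702509, 0.708147, 0.705317),
    27000: (0.707315, 0.698796, 0.703029),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Largest gap between the recomputed and the reported F1 values.
max_gap = max(abs(f1_score(p, r) - f1) for p, r, f1 in results.values())

# Checkpoint with the highest reported F1 (assumed selection criterion).
best_step = max(results, key=lambda step: results[step][2])

print(f"best checkpoint by F1: {best_step}")
print(f"max |recomputed - reported| F1 gap: {max_gap:.1e}")
```

Under that assumption the ranking points to step 25000, the checkpoint this model corresponds to.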

Model Details

Model Description

  • Model Type: SpanMarker
  • Encoder: muppet-roberta-large
  • Maximum Sequence Length: 256 tokens
  • Maximum Entity Length: 6 words
  • Training Dataset: DFKI-SLT/few-nerd
  • Language: en
  • License: cc-by-sa-4.0
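
The maximum entity length limits which spans can be labelled at all: an entity longer than 6 words cannot be predicted as a single span. The sketch below enumerates candidate word spans under that limit. It is a simplified illustration of span-based NER in general, not SpanMarker's actual implementation, and `candidate_spans` with its parameters is a hypothetical name.

```python
from typing import List, Tuple


def candidate_spans(
    words: List[str], max_entity_length: int = 6
) -> List[Tuple[int, int]]:
    """Return (start, end) word-index pairs, end exclusive, for every
    span of at most max_entity_length words."""
    spans = []
    for start in range(len(words)):
        # A span starting here may not run past the sentence or the limit.
        last_end = min(start + max_entity_length, len(words))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans


words = "Cleopatra VII ruled the Ptolemaic Kingdom of Egypt".split()
spans = candidate_spans(words)
print(len(words), "words ->", len(spans), "candidate spans")
```

Because the number of candidates grows with both sentence length and the entity limit, a smaller limit keeps inference cheaper at the cost of missing very long entities.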

Useful Links

Uses

Direct Use for Inference

from span_marker import SpanMarkerModel

# Download from the 🤗 Hub
model = SpanMarkerModel.from_pretrained("eek/span-marker-muppet-roberta-large-fewnerd-fine-super")
# Run inference
entities = model.predict("His name was Radu.")

Alternatively, the model can be used directly in spaCy via SpanMarker:

import spacy

nlp = spacy.load("en_core_web_sm", exclude=["ner"])
nlp.add_pipe("span_marker", config={"model": "eek/span-marker-muppet-roberta-large-fewnerd-fine-super"})

text = """Cleopatra VII, also known as Cleopatra the Great, was the last active ruler of the \
Ptolemaic Kingdom of Egypt. She was born in 69 BCE and ruled Egypt from 51 BCE until her \
death in 30 BCE."""
doc = nlp(text)
print([(entity, entity.label_) for entity in doc.ents])
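
To work with the result of `model.predict` from the first example, it helps to know its shape. As far as the SpanMarker documentation describes it, each prediction is a dictionary with `span`, `label`, `score`, `char_start_index` and `char_end_index` keys; treat that as an assumption and check it against your installed version. The sample below is hand-written to show the shape, not real output of this model, and `filter_entities` is a hypothetical helper.

```python
from typing import Dict, List

text = "His name was Radu."

# Hand-written stand-in for the list returned by model.predict(text);
# the labels and scores are illustrative, not actual predictions.
sample_entities: List[Dict] = [
    {"span": "Radu", "label": "person-other", "score": 0.93,
     "char_start_index": 13, "char_end_index": 17},
    {"span": "name", "label": "other-language", "score": 0.21,
     "char_start_index": 4, "char_end_index": 8},
]


def filter_entities(entities: List[Dict], min_score: float = 0.5) -> List[Dict]:
    """Keep only predictions whose confidence reaches min_score."""
    return [entity for entity in entities if entity["score"] >= min_score]


confident = filter_entities(sample_entities)
for entity in confident:
    start, end = entity["char_start_index"], entity["char_end_index"]
    # The character offsets index back into the original input text.
    print(f"{text[start:end]!r} -> {entity['label']} ({entity['score']:.2f})")
```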

Training Details

Framework Versions

  • Python: 3.10.13
  • SpanMarker: 1.5.0
  • Transformers: 4.36.2
  • PyTorch: 2.2.1+cu121
  • Datasets: 2.18.0
  • Tokenizers: 0.15.2

Training Arguments

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="models/span-marker-muppet-roberta-large-fewnerd-fine-super",
    learning_rate=1e-5,
    gradient_accumulation_steps=2,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=8,
    evaluation_strategy="steps",
    save_strategy="steps",
    save_steps=1000,
    eval_steps=500,
    push_to_hub=False,
    logging_steps=50,
    fp16=True,
    warmup_ratio=0.1,
    dataloader_num_workers=1,
    load_best_model_at_end=True
)
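
A few quantities follow directly from these arguments. The sketch below works them out in plain Python. It assumes a single GPU (the card mentions one 4090), and `warmup_steps` is a hypothetical helper that assumes the warmup length is the warmup ratio times the total number of optimizer steps, rounded up. The total step count is not stated in the card, so it stays a parameter.

```python
import math

# Values copied from the training arguments above.
per_device_train_batch_size = 8
gradient_accumulation_steps = 2
save_steps = 1000
eval_steps = 500
warmup_ratio = 0.1

# Assumption: training ran on a single GPU.
num_devices = 1

# Number of samples contributing to each optimizer update.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# With load_best_model_at_end=True, every saved checkpoint should coincide
# with an evaluation, so save_steps must be a multiple of eval_steps.
checkpoints_align = save_steps % eval_steps == 0


def warmup_steps(total_steps: int, ratio: float = warmup_ratio) -> int:
    """Warmup length for a given total number of optimizer steps
    (assumed rule: ratio * total, rounded up)."""
    return math.ceil(total_steps * ratio)


print("effective batch size:", effective_batch_size)
print("checkpoints align with evaluations:", checkpoints_align)
```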

Thanks

Thanks to Tom Aarsen for the SpanMarker library.

BibTeX

@software{Aarsen_SpanMarker,
    author = {Aarsen, Tom},
    license = {Apache-2.0},
    title = {{SpanMarker for Named Entity Recognition}},
    url = {https://github.com/tomaarsen/SpanMarkerNER}
}

Model Card Authors
