tomaarsen (HF staff) committed
Commit 3f687ac
1 Parent(s): cc815b6

Add improved README

Files changed (1):
  1. README.md +91 -4
README.md CHANGED
@@ -8,11 +8,64 @@ tags:
  - ner
  - named-entity-recognition
  pipeline_tag: token-classification
+ widget:
+ - text: "Here, DA = direct assessment, RR = relative ranking, DS = discrete scale and CS = continuous scale."
+   example_title: "Example 1"
+ - text: "Modifying or replacing the Erasable Programmable Read Only Memory (EPROM) in a phone would allow the configuration of any ESN and MIN via software for cellular devices."
+   example_title: "Example 2"
+ - text: "We propose a technique called Aggressive Stochastic Weight Averaging (ASWA) and an extension called Norm-filtered Aggressive Stochastic Weight Averaging (NASWA) which improves the stability of models over random seeds."
+   example_title: "Example 3"
+ - text: "The choice of the encoder and decoder modules of DNPG can be quite flexible, for instance long-short term memory networks (LSTM) or convolutional neural network (CNN)."
+   example_title: "Example 4"
+ model-index:
+ - name: SpanMarker w. bert-base-cased on Acronym Identification by Tom Aarsen
+   results:
+   - task:
+       type: token-classification
+       name: Named Entity Recognition
+     dataset:
+       type: acronym_identification
+       name: Acronym Identification
+       split: validation
+       revision: c3c245a18bbd57b1682b099e14460eebf154cbdf
+     metrics:
+     - type: f1
+       value: 0.9310
+       name: F1
+     - type: precision
+       value: 0.9423
+       name: Precision
+     - type: recall
+       value: 0.9199
+       name: Recall
+ datasets:
+ - acronym_identification
+ language:
+ - en
+ metrics:
+ - f1
+ - recall
+ - precision
  ---
  
- # SpanMarker for Named Entity Recognition
+ # SpanMarker for Acronyms Named Entity Recognition
  
- This is a [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) model that can be used for Named Entity Recognition. In particular, this SpanMarker model uses [bert-base-cased](https://huggingface.co/bert-base-cased) as the underlying encoder.
+ This is a [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) model trained on the [acronym_identification](https://huggingface.co/datasets/acronym_identification) dataset. In particular, this SpanMarker model uses [bert-base-cased](https://huggingface.co/bert-base-cased) as the underlying encoder. See [train.py](train.py) for the training script.
+ 
+ ## Metrics
+ 
+ It achieves the following results on the validation set:
+ - Overall Precision: 0.9423
+ - Overall Recall: 0.9199
+ - Overall F1: 0.9310
+ - Overall Accuracy: 0.9830
+ 
+ ## Labels
+ 
+ | **Label** | **Examples** |
+ |-----------|--------------|
+ | SHORT | "NLP", "CoQA", "SODA", "SCA" |
+ | LONG | "Natural Language Processing", "Conversational Question Answering", "Symposium on Discrete Algorithms", "successive convex approximation" |
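+ 
+ As a rough illustration of how these labels surface at inference time (key names and values here are hypothetical, based on the span_marker documentation rather than verified output of this model):
+ 
+ ```python
+ # Hypothetical single prediction; "span", "label" and "score" are assumed keys.
+ {"span": "PCA", "label": "SHORT", "score": 0.99}
+ ```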
  
  
  ## Usage
@@ -29,9 +82,43 @@ You can then run inference with this model like so:
  from span_marker import SpanMarkerModel
  
  # Download from the 🤗 Hub
- model = SpanMarkerModel.from_pretrained("span_marker_model_name")
+ model = SpanMarkerModel.from_pretrained("tomaarsen/span_marker_bert_base_acronyms")
  # Run inference
- entities = model.predict("Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic to Paris.")
+ entities = model.predict("Compression algorithms like Principal Component Analysis (PCA) can reduce noise and complexity.")
  ```
  
  See the [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) repository for documentation and additional information on this library.
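+ 
+ `predict` returns a list of prediction dicts, one per detected span. A minimal sketch of inspecting them, assuming the "span", "label" and "score" keys described in the span_marker documentation (exact fields may vary by version):
+ 
+ ```python
+ # Illustrative only: key names are assumptions, not verified output.
+ for entity in entities:
+     print(f"{entity['span']!r} -> {entity['label']} ({entity['score']:.2f})")
+ # Expected: "Principal Component Analysis" tagged LONG, "PCA" tagged SHORT.
+ ```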
+ 
+ ## Training procedure
+ 
+ ### Training hyperparameters
+ 
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 32
+ - eval_batch_size: 32
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 2
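+ 
+ The linked [train.py](train.py) is the authoritative script; what follows is only a rough, hypothetical sketch of how these hyperparameters could map onto the span_marker Trainer API. The label list and the column rename are assumptions based on the dataset card, not taken from the actual script:
+ 
+ ```python
+ # Hypothetical training sketch, not the actual train.py.
+ from datasets import load_dataset
+ from transformers import TrainingArguments
+ from span_marker import SpanMarkerModel, Trainer
+ 
+ dataset = load_dataset("acronym_identification")
+ # Assumption: SpanMarker expects a "ner_tags" column, while this dataset
+ # stores its BIO tags under "labels".
+ dataset = dataset.rename_column("labels", "ner_tags")
+ labels = ["O", "B-long", "B-short", "I-long", "I-short"]
+ 
+ # Initialize a SpanMarker model on top of the bert-base-cased encoder.
+ model = SpanMarkerModel.from_pretrained("bert-base-cased", labels=labels)
+ 
+ args = TrainingArguments(
+     output_dir="models/span_marker_bert_base_acronyms",
+     learning_rate=5e-5,
+     per_device_train_batch_size=32,
+     per_device_eval_batch_size=32,
+     num_train_epochs=2,
+     warmup_ratio=0.1,
+     lr_scheduler_type="linear",
+     seed=42,
+ )
+ trainer = Trainer(
+     model=model,
+     args=args,
+     train_dataset=dataset["train"],
+     eval_dataset=dataset["validation"],
+ )
+ trainer.train()
+ print(trainer.evaluate())
+ ```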
+ 
+ ### Training results
+ 
+ | Training Loss | Epoch | Step | Validation Loss | Overall Precision | Overall Recall | Overall F1 | Overall Accuracy |
+ |:-------------:|:-----:|:----:|:---------------:|:-----------------:|:--------------:|:----------:|:----------------:|
+ | 0.0109 | 0.31 | 200 | 0.0079 | 0.9202 | 0.8962 | 0.9080 | 0.9765 |
+ | 0.0075 | 0.62 | 400 | 0.0070 | 0.9358 | 0.8724 | 0.9030 | 0.9765 |
+ | 0.0068 | 0.93 | 600 | 0.0059 | 0.9363 | 0.9203 | 0.9282 | 0.9821 |
+ | 0.0057 | 1.24 | 800 | 0.0056 | 0.9372 | 0.9187 | 0.9278 | 0.9824 |
+ | 0.0051 | 1.55 | 1000 | 0.0054 | 0.9381 | 0.9170 | 0.9274 | 0.9824 |
+ | 0.0054 | 1.86 | 1200 | 0.0053 | 0.9424 | 0.9218 | 0.9320 | 0.9834 |
+ | 0.0054 | 2.00 | 1290 | 0.0054 | 0.9423 | 0.9199 | 0.9310 | 0.9830 |
+ 
+ ### Framework versions
+ 
+ - SpanMarker 1.2.4
+ - Transformers 4.31.0
+ - Pytorch 1.13.1+cu117
+ - Datasets 2.14.3
+ - Tokenizers 0.13.2