tomaarsen (HF staff) committed
Commit 496a0e5
1 Parent(s): 3a00e51

Upload model

Files changed (5)
  1. README.md +8 -11
  2. config.json +3 -2
  3. pytorch_model.bin +1 -1
  4. tokenizer.json +2 -16
  5. tokenizer_config.json +1 -1
README.md CHANGED
@@ -1,21 +1,18 @@
+
  ---
  license: apache-2.0
- library_name: span_marker
+ library_name: span-marker
  tags:
- - span_marker
+ - span-marker
  - token-classification
  - ner
  - named-entity-recognition
  pipeline_tag: token-classification
- datasets:
- - conll2003
- language:
- - en
  ---
 
  # SpanMarker for Named Entity Recognition
 
- This is a [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) model that can be used for Named Entity Recognition. In particular, this SpanMarker model uses [prajjwal1/bert-tiny](https://huggingface.co/prajjwal1/bert-tiny) as the underlying encoder.
+ This is a [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) model that can be usedfor Named Entity Recognition. In particular, this SpanMarker model uses [prajjwal1/bert-tiny](https://huggingface.co/prajjwal1/bert-tiny) as the underlying encoder.
 
  ## Usage
 
@@ -25,15 +22,15 @@ To use this model for inference, first install the `span_marker` library:
  pip install span_marker
  ```
 
- You can then run inference as follows:
+ You can then run inference with this model like so:
 
  ```python
  from span_marker import SpanMarkerModel
 
- # Download from Hub and run inference
- model = SpanMarkerModel.from_pretrained("tomaarsen/span-marker-bert-tiny-conll03")
+ # Download from the 🤗 Hub
+ model = SpanMarkerModel.from_pretrained("span_marker_model_name")
  # Run inference
  entities = model.predict("Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic to Paris.")
  ```
 
- See the [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) repository for documentation and additional information on this model framework.
+ See the [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) repository for documentation and additional information on this library.
config.json CHANGED
@@ -1,5 +1,5 @@
  {
- "_name_or_path": "models\\bt-conll-5\\checkpoint-final",
+ "_name_or_path": "models\\bt-conll-full-4\\checkpoint-final",
  "architectures": [
  "SpanMarkerModel"
  ],
@@ -123,9 +123,10 @@
  "PER": 4
  },
  "marker_max_length": 256,
- "model_max_length": null,
+ "model_max_length": 256,
  "model_max_length_default": 512,
  "model_type": "span-marker",
+ "span_marker_version": "1.0.0.dev",
  "torch_dtype": "float32",
  "transformers_version": "4.27.2",
  "vocab_size": 30524
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:217dcde95ab73c6fd2f420aef483162e552947b65ef33af93801745e9885ed9e
+ oid sha256:c5340b64f24a69f1b52c62e345b1932f8fc9e71c785b09d35fb522e3e371ddba
  size 17567279
tokenizer.json CHANGED
@@ -1,21 +1,7 @@
  {
  "version": "1.0",
- "truncation": {
- "direction": "Right",
- "max_length": 256,
- "strategy": "LongestFirst",
- "stride": 0
- },
- "padding": {
- "strategy": {
- "Fixed": 256
- },
- "direction": "Right",
- "pad_to_multiple_of": null,
- "pad_id": 0,
- "pad_type_id": 0,
- "pad_token": "[PAD]"
- },
+ "truncation": null,
+ "padding": null,
  "added_tokens": [
  {
  "id": 0,
tokenizer_config.json CHANGED
@@ -4,7 +4,7 @@
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "mask_token": "[MASK]",
- "model_max_length": 256,
+ "model_max_length": 1000000000000000019884624838656,
  "never_split": null,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
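
The new `model_max_length` in tokenizer_config.json reads like a "no limit" placeholder rather than a real sequence length: it is exactly the integer Python produces for `int(1e30)`. The check below is a minimal sketch under that reading. The helper `effective_max_length` is hypothetical and only illustrates one way a caller might fall back to the 256 now stored in config.json; it is not a description of how SpanMarker resolves the value.

```python
# Values copied from this commit's diff.
TOKENIZER_MAX_LENGTH = 1000000000000000019884624838656  # tokenizer_config.json
CONFIG_MAX_LENGTH = 256  # "model_max_length" in config.json

# 1e30 is not exactly representable as a float, so converting it to int
# yields the odd-looking trailing digits seen in the diff.
UNSET_SENTINEL = int(1e30)


def effective_max_length(tokenizer_max: int, config_max: int) -> int:
    """Hypothetical helper: treat the sentinel as 'unset' and use the config value."""
    if tokenizer_max >= UNSET_SENTINEL:
        return config_max
    return min(tokenizer_max, config_max)


print(TOKENIZER_MAX_LENGTH == UNSET_SENTINEL)
print(effective_max_length(TOKENIZER_MAX_LENGTH, CONFIG_MAX_LENGTH))
```

Under this assumption, the explicit limit of 256 lives in config.json after this commit rather than in the tokenizer files.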