---
language: en
license: apache-2.0
datasets:
- conll2003
model-index:
- name: elastic/distilbert-base-uncased-finetuned-conll03-english
  results:
  - task:
      type: token-classification
      name: Token Classification
    dataset:
      name: conll2003
      type: conll2003
      config: conll2003
      split: validation
    metrics:
    - type: accuracy
      value: 0.9854480753649896
      name: Accuracy
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZmM0NzNhYTM2NGU0YjMwZDMwYTdhYjY3MDgwMTYxNWRjYzQ1NmE0OGEwOTcxMGY5ZTU1ZTQ3OTM5OGZkYjE2NCIsInZlcnNpb24iOjF9.v8Mk62C40vRWQ78BSCtGyphKKHd6q-Ir6sVbSjNjG37j9oiuQN3CDmk9XItmjvCwyKwMEr2NqUXaSyIfUSpBDg
    - type: precision
      value: 0.9880928983228512
      name: Precision
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMWIzYTg2OTFjY2FkNWY4MzUyN2ZjOGFlYWNhODYzODVhYjQwZTQ3YzdhMzMxY2I4N2U0YWI1YWVlYjIxMDdkNCIsInZlcnNpb24iOjF9.A50vF5qWgZjxABjL9tc0vssFxYHYhBQ__hLXcvuoZoK8c2TyuODHcM0LqGLeRJF8kcPaLx1hcNk3QMdOETVQBA
    - type: recall
      value: 0.9895677847945542
      name: Recall
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYzBiZDg1YmM2NzFkNjQ3MzUzN2QzZDAwNzUwMmM3MzU1ODBlZWJjYmI1YzIxM2YxMzMzNDUxYjkyYzQzMDQ3ZSIsInZlcnNpb24iOjF9.aZEC0c93WWn3YoPkjhe2W1-OND9U2qWzesL9zioNuhstbj7ftANERs9dUAaJIlNCb7NS28q3x9c2s6wGLwovCw
    - type: f1
      value: 0.9888297915932504
      name: F1
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYmNkNzVhODJjMjExOTg4ZjQwMWM4NGIxZGNiZTZlMDk5MzNmMjIwM2ZiNzdiZGIxYmNmNmJjMGVkYTlkN2FlNiIsInZlcnNpb24iOjF9.b6qmLHkHu-z5V1wC2yQMyIcdeReptK7iycIMyGOchVy6WyG4flNbxa5f2W05INdnJwX-PHavB_yaY0oULdKWDQ
    - type: loss
      value: 0.06707527488470078
      name: loss
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNDRlMWE2OTQxNWI5MjY0NzJjNjJkYjg1OWE1MjE2MjI4N2YzOWFhMDI3OTE0ZmFhM2M0ZWU0NTUxNTBiYjhiZiIsInZlcnNpb24iOjF9.6JhhyfhXxi76GRLUNqekU_SRVsV-9Hwpm2iOD_OJusPZTIrEUCmLdIWtb9abVNWNzMNOmA4TkRLqLVca0o0HAw
---

This is [DistilBERT base uncased](https://huggingface.co/distilbert-base-uncased), fine-tuned for named entity recognition (NER) on the [conll03 english dataset](https://huggingface.co/datasets/conll2003). Note that this model is **not** sensitive to capital letters: "english" is treated the same as "English". For the case-sensitive version, please use [elastic/distilbert-base-cased-finetuned-conll03-english](https://huggingface.co/elastic/distilbert-base-cased-finetuned-conll03-english).

## Versions

- Transformers version: 4.3.1
- Datasets version: 1.3.0

## Training

```
$ run_ner.py \
  --model_name_or_path distilbert-base-uncased \
  --label_all_tokens True \
  --return_entity_level_metrics True \
  --dataset_name conll2003 \
  --output_dir /tmp/distilbert-base-uncased-finetuned-conll03-english \
  --do_train \
  --do_eval
```

After training, we update the model's labels to match the NER-specific labels from the
[conll2003](https://raw.githubusercontent.com/huggingface/datasets/1.3.0/datasets/conll2003/dataset_infos.json) dataset.
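
The exact procedure used for the label update is not recorded here. As a minimal sketch, assuming the update amounts to rewriting the `id2label` and `label2id` maps in the model's `config.json`, it could look like the following. The helper name `relabel_config` and the placeholder `LABEL_<n>` names are hypothetical; the label order follows the `ner_tags` feature of the conll2003 dataset.

```python
import json

# Label names in the order used by the `ner_tags` feature of conll2003.
CONLL2003_NER_LABELS = [
    "O",
    "B-PER", "I-PER",
    "B-ORG", "I-ORG",
    "B-LOC", "I-LOC",
    "B-MISC", "I-MISC",
]


def relabel_config(config, labels=CONLL2003_NER_LABELS):
    """Return a copy of a config dict with NER-specific label maps.

    Hypothetical helper: it assumes `config` is the parsed content of a
    model's config.json, where `id2label` keys are strings.
    """
    updated = dict(config)
    updated["id2label"] = {str(i): label for i, label in enumerate(labels)}
    updated["label2id"] = {label: i for i, label in enumerate(labels)}
    return updated


if __name__ == "__main__":
    # Assumed starting point: generic placeholder labels.
    generic = {
        "model_type": "distilbert",
        "id2label": {str(i): f"LABEL_{i}" for i in range(9)},
        "label2id": {f"LABEL_{i}": i for i in range(9)},
    }
    print(json.dumps(relabel_config(generic), indent=2))
```

With these maps in place, predictions are reported with tags such as `B-PER` or `I-LOC` rather than numeric placeholders.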