vitus9988 committed on
Commit
2ff8199
1 Parent(s): 12d85b8

End of training

Files changed (2)
  1. README.md +27 -59
  2. model.safetensors +1 -1
README.md CHANGED
@@ -1,50 +1,43 @@
1
  ---
2
- base_model: klue/roberta-small
3
  tags:
4
  - generated_from_trainer
5
- - korean
6
- - klue
7
- widget:
8
- - text: 저는 서울특별시 강남대로에 삽니다. 전화번호는 010-1234-5678이고 주민등록번호는 123456-1234567입니다. 메일주소는 hugging@face.com입니다.
9
  metrics:
10
  - precision
11
  - recall
12
  - f1
13
  - accuracy
14
  model-index:
15
- - name: klue_roberta_small_ner_identified
16
  results: []
17
- language:
18
- - ko
19
- pipeline_tag: token-classification
20
  ---
21
 
22
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
23
  should probably proofread and complete it, then remove this comment. -->
24
 
25
- # klue_roberta_small_ner_identified
26
 
27
- This model is a fine-tuned version of [klue/roberta-small](https://huggingface.co/klue/roberta-small) on an unknown dataset.
28
  It achieves the following results on the evaluation set:
29
- - Loss: 0.0212
30
- - Precision: 0.9803
31
- - Recall: 1.0
32
- - F1: 0.9901
33
- - Accuracy: 0.9980
34
 
35
  ## Model description
36
 
37
- Provides named entity recognition for the following items.
38
 
39
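- - Person name [PS] - low recognition rate
40
- - Address (old-style and road-name addresses) [AD]
41
- - Card number [CN]
42
- - Bank account number [BN]
43
- - Driver's license number [DN]
44
- - Resident registration number [RN]
45
- - Passport number [PN]
46
- - Phone number [PH]
47
- - Email address [EM]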
48
 
49
  ### Training hyperparameters
50
 
@@ -61,14 +54,14 @@ The following hyperparameters were used during training:
61
 
62
  | Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
63
  |:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
64
- | No log | 1.0 | 15 | 0.2866 | 0.1199 | 0.2739 | 0.1668 | 0.9287 |
65
- | No log | 2.0 | 30 | 0.1369 | 0.6599 | 0.7996 | 0.7231 | 0.9654 |
66
- | No log | 3.0 | 45 | 0.0629 | 0.8088 | 0.9042 | 0.8538 | 0.9915 |
67
- | No log | 4.0 | 60 | 0.0381 | 0.9760 | 0.9978 | 0.9868 | 0.9969 |
68
- | No log | 5.0 | 75 | 0.0276 | 0.9781 | 0.9955 | 0.9868 | 0.9981 |
69
- | No log | 6.0 | 90 | 0.0238 | 0.9803 | 1.0 | 0.9901 | 0.9979 |
70
- | No log | 7.0 | 105 | 0.0224 | 0.9803 | 1.0 | 0.9901 | 0.9979 |
71
- | No log | 8.0 | 120 | 0.0212 | 0.9803 | 1.0 | 0.9901 | 0.9980 |
72
 
73
 
74
  ### Framework versions
@@ -77,28 +70,3 @@ The following hyperparameters were used during training:
77
  - Pytorch 2.3.0+cu118
78
  - Datasets 2.19.1
79
  - Tokenizers 0.19.1
80
-
81
- ### Use
82
- ```python
83
- from transformers import AutoTokenizer, AutoModelForTokenClassification
84
- from transformers import pipeline
85
-
86
- tokenizer = AutoTokenizer.from_pretrained("vitus9988/klue-roberta-small-ner-identified")
87
- model = AutoModelForTokenClassification.from_pretrained("vitus9988/klue-roberta-small-ner-identified")
88
-
89
- nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
90
- example = """
91
- 저는 서울특별시 강남대로 56길 100호에 삽니다. 전화번호는 010-1234-5678이고 주민등록번호는 123456-1234567입니다. 메일주소는 hugging@face.com입니다.
92
- """
93
-
94
- ner_results = nlp(example)
95
- for i in ner_results:
96
-     print(i)
97
-
98
- #{'entity_group': 'AD', 'score': 0.79996574, 'word': '서울특별시 강남대로 56길 100호', 'start': 4, 'end': 23}
99
- #{'entity_group': 'PH', 'score': 0.948794, 'word': '010 - 1234 - 5678', 'start': 36, 'end': 49}
100
- #{'entity_group': 'RN', 'score': 0.90686846, 'word': '123456 - 1234567', 'start': 60, 'end': 74}
101
- #{'entity_group': 'EM', 'score': 0.935588, 'word': 'hugging @ face. com', 'start': 85, 'end': 101}
102
-
103
- ```
104
-
 
1
  ---
2
+ base_model: vitus9988/klue-roberta-small-ner-identified
3
  tags:
4
  - generated_from_trainer
 
 
 
 
5
  metrics:
6
  - precision
7
  - recall
8
  - f1
9
  - accuracy
10
  model-index:
11
+ - name: klue-roberta-small-ner-identified
12
  results: []
 
 
 
13
  ---
14
 
15
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
16
  should probably proofread and complete it, then remove this comment. -->
17
 
18
+ # klue-roberta-small-ner-identified
19
 
20
+ This model is a fine-tuned version of [vitus9988/klue-roberta-small-ner-identified](https://huggingface.co/vitus9988/klue-roberta-small-ner-identified) on an unknown dataset.
21
  It achieves the following results on the evaluation set:
22
+ - Loss: 0.1304
23
+ - Precision: 0.9222
24
+ - Recall: 0.9520
25
+ - F1: 0.9369
26
+ - Accuracy: 0.9790
27
 
28
  ## Model description
29
 
30
+ More information needed
31
 
32
+ ## Intended uses & limitations
33
+
34
+ More information needed
35
+
36
+ ## Training and evaluation data
37
+
38
+ More information needed
39
+
40
+ ## Training procedure
41
 
42
  ### Training hyperparameters
43
 
 
54
 
55
  | Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
56
  |:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
57
+ | No log | 1.0 | 4 | 0.8023 | 0.1377 | 0.1231 | 0.1300 | 0.9178 |
58
+ | No log | 2.0 | 8 | 0.4197 | 0.5419 | 0.5580 | 0.5498 | 0.9431 |
59
+ | No log | 3.0 | 12 | 0.2760 | 0.6764 | 0.7146 | 0.6950 | 0.9564 |
60
+ | No log | 4.0 | 16 | 0.2062 | 0.7835 | 0.8544 | 0.8174 | 0.9617 |
61
+ | No log | 5.0 | 20 | 0.1685 | 0.8299 | 0.8946 | 0.8610 | 0.9711 |
62
+ | No log | 6.0 | 24 | 0.1470 | 0.8854 | 0.9295 | 0.9069 | 0.9758 |
63
+ | No log | 7.0 | 28 | 0.1350 | 0.9138 | 0.9460 | 0.9297 | 0.9778 |
64
+ | No log | 8.0 | 32 | 0.1304 | 0.9222 | 0.9520 | 0.9369 | 0.9790 |
65
 
66
 
67
  ### Framework versions
 
70
  - Pytorch 2.3.0+cu118
71
  - Datasets 2.19.1
72
  - Tokenizers 0.19.1
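
The final-epoch metrics on both sides of this README diff can be cross-checked against the standard definition of F1 as the harmonic mean of precision and recall. The sketch below is not part of the commit. The helper name `f1_score` and the `reported` dictionary are illustrative, and the inputs are the rounded values copied from the two result tables, so the last digit may differ slightly from what the Trainer computed from raw counts.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Epoch 8 values copied from the README before and after this commit.
reported = {
    "before": {"precision": 0.9803, "recall": 1.0, "f1": 0.9901},
    "after": {"precision": 0.9222, "recall": 0.9520, "f1": 0.9369},
}

for side, m in reported.items():
    computed = f1_score(m["precision"], m["recall"])
    print(f"{side}: computed F1 = {computed:.4f}, reported F1 = {m['f1']:.4f}")
```

Both reported F1 values agree with their precision and recall to within rounding, so the drop from 0.9901 to 0.9369 reflects the new evaluation run rather than a transcription slip in the card.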
 
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:37c9d1f23c210be3aecf790eb851ecea7030ff68a58695bd444f22375f8427b8
3
  size 270078052
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:62ecc238c318744c81341f4e8ebaf5721a004d42a901ad703449016993e4dd4f
3
  size 270078052
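
The `model.safetensors` change above is a Git LFS pointer update: the size is identical on both sides, so only the `oid` distinguishes the old weights from the new ones. A minimal sketch of checking a downloaded file against such a pointer follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for this illustration, not part of Git LFS or any Hugging Face library, and the path passed to `matches_pointer` is assumed to point at a local copy of the weights.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and hash."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents copied from the "+" side of the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:62ecc238c318744c81341f4e8ebaf5721a004d42a901ad703449016993e4dd4f
size 270078052
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])
```

With a local copy of the weights, `matches_pointer("model.safetensors", pointer)` would indicate whether the file corresponds to this commit or to the parent.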