Pavarissy committed
Commit 5c82642
1 parent: 7b5846d

Update README.md

Files changed (1):
  README.md (+17 -8)
README.md CHANGED
@@ -7,6 +7,8 @@ datasets:
 - universal_dependencies
 metrics:
 - accuracy
+- precision
+- recall
 model-index:
 - name: mdeberta-v3-ud-thai-pud-upos
   results:
@@ -23,6 +25,9 @@ model-index:
     - name: Accuracy
       type: accuracy
       value: 0.9934846474601972
+language:
+- th
+library_name: transformers
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -43,17 +48,21 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
-More information needed
+This model is trained on the UD Thai PUD corpus with Universal Part-of-Speech (UPOS) tags to help with POS tagging in the Thai language.
 
-## Intended uses & limitations
+## Example
+```python
+from transformers import AutoModelForTokenClassification, AutoTokenizer, TokenClassificationPipeline
 
-More information needed
+model = AutoModelForTokenClassification.from_pretrained("Pavarissy/mdeberta-v3-ud-thai-pud-upos")
+tokenizer = AutoTokenizer.from_pretrained("Pavarissy/mdeberta-v3-ud-thai-pud-upos")
 
-## Training and evaluation data
+pipeline = TokenClassificationPipeline(model=model, tokenizer=tokenizer, grouped_entities=True)
+outputs = pipeline("ประเทศไทย อยู่ใน ทวีป เอเชีย")
+print(outputs)
+# [{'entity_group': 'PROPN', 'score': 0.9946701, 'word': 'ประเทศไทย', 'start': 0, 'end': 9}, {'entity_group': 'VERB', 'score': 0.85809743, 'word': 'อยู่ใน', 'start': 9, 'end': 16}, {'entity_group': 'NOUN', 'score': 0.99632, 'word': 'ทวีป', 'start': 16, 'end': 21}, {'entity_group': 'PROPN', 'score': 0.9961184, 'word': 'เอเชีย', 'start': 21, 'end': 28}]
 
-More information needed
-
-## Training procedure
+```
 
 ### Training hyperparameters
 
@@ -87,4 +96,4 @@ The following hyperparameters were used during training:
 - Transformers 4.34.1
 - Pytorch 2.1.0+cu118
 - Datasets 2.14.6
-- Tokenizers 0.14.1
+- Tokenizers 0.14.1
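
The `entity_group` values in the example output added above come from the standard 17-tag Universal POS (UPOS) inventory defined by the Universal Dependencies guidelines. A minimal, offline sketch of a sanity check on pipeline outputs (the helper `check_upos_output` is hypothetical, not part of the model repo):

```python
# The 17 Universal POS tags (UPOS) from the Universal Dependencies guidelines.
UPOS_TAGS = {
    "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
}

def check_upos_output(entities):
    """Return True if every entity_group in a pipeline output is a valid UPOS tag."""
    return all(e["entity_group"] in UPOS_TAGS for e in entities)

# Sample output taken from the README example above:
sample = [
    {"entity_group": "PROPN", "score": 0.9946701, "word": "ประเทศไทย", "start": 0, "end": 9},
    {"entity_group": "VERB", "score": 0.85809743, "word": "อยู่ใน", "start": 9, "end": 16},
    {"entity_group": "NOUN", "score": 0.99632, "word": "ทวีป", "start": 16, "end": 21},
    {"entity_group": "PROPN", "score": 0.9961184, "word": "เอเชีย", "start": 21, "end": 28},
]
print(check_upos_output(sample))  # True
```

This runs without downloading the model, so it can serve as a lightweight check when wiring the pipeline into a larger system.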