dchaplinsky committed on
Commit
978a264
1 Parent(s): 00c3adc

Update README.md

  1. README.md +51 -0
README.md CHANGED
@@ -1,3 +1,54 @@
  ---
+ tags:
+ - spacy
+ - token-classification
+ language: uk
+ datasets:
+ - ner-uk.2.0
  license: mit
+ model-index:
+ - name: uk_ner_web_trf_13class
+   results:
+   - task:
+       name: NER
+       type: token-classification
+     metrics:
+     - name: NER Precision
+       type: precision
+       value: 0.8977982743
+     - name: NER Recall
+       type: recall
+       value: 0.8860666569
+     - name: NER F Score
+       type: f_score
+       value: 0.891893889
+ widget:
+ - text: "Президент Володимир Зеленський пояснив, що наразі діалог із режимом Володимира путіна неможливий, адже агресор обрав курс на знищення українського народу. За словами Зеленського цей режим РФ виявляє неповагу до суверенітету і територіальної цілісності України."
  ---
+ # uk_ner_web_trf_13class
+
+ ## Model description
+
+ **uk_ner_web_trf_13class** is a fine-tuned [RoBERTa Large Ukrainian model](https://huggingface.co/benjamin/roberta-large-wechsel-ukrainian) for **Named Entity Recognition** that achieves new state-of-the-art (**SOTA**) performance on the NER task for the Ukrainian language.
+
+ It has been trained to recognize **thirteen** types of entities:
+ - **ORG**: a name of a company, brand, agency, organization, institution (including religious, informal, non-profit), party, people's association, or specific project like a conference, a music band, a TV program, etc. Example: *UNESCO*.
+ - **PERS**: a person's name, where a person may be a human, a book character, or a humanoid creature such as a vampire, ghost, or mermaid. Example: *Marquis de Sade*.
+ - **LOC**: a geographical name, including names of districts, villages, cities, states, counties, countries, continents, rivers, lakes, seas, oceans, mountains, etc. Example: *Ukraine*.
+ - **MON**: a sum of money including the currency. Examples: *\$40, 1 mln hryvnias*.
+ - **PCT**: a percent value including the percent sign or the word "percent". Example: *10\%*.
+ - **DATE**: a full or incomplete calendar date that may include a century, a year, a month, a day. Examples: *last week, 10.12.1999*.
+ - **TIME**: a textual or numerical timestamp. Examples: *half past six, 18:30*.
+ - **PERIOD**: a time period, which may consist of two dates. Examples: *a few months, 2014-2015*.
+ - **JOB**: a job title. Examples: *member of parliament, ophthalmologist*.
+ - **DOC**: a unique name of a document, including names of contracts, orders, bills, purchases. Example: *procurement contract CW2244226*.
+ - **QUANT**: a quantity with the unit of measurement, such as weight, distance, size. Examples: *3 kilograms, a hundred miles*.
+ - **ART** (artifact): a name of a human-made product, like a book, a song, a car, or a sandwich. Examples: *Mona Lisa, iPhone*.
+ - **MISC**: any other entity not covered in the list above, like names of holidays, websites, battles, wars, sports events, hurricanes, etc. Example: *Black Friday*.
+
+ The model was fine-tuned on the [NER-UK 2.0 dataset](https://github.com/lang-uk/ner-uk), released by the [lang-uk](https://lang.org.ua) project.
+
+ Another transformer-based spaCy model, **trained on 4 classes**, is available [here](https://huggingface.co/dchaplinsky/uk_ner_web_trf_best).
+
+
+ Copyright: [Dmytro Chaplynskyi](https://twitter.com/dchaplinsky), [Mariana Romanyshyn](https://scholar.google.com/citations?user=yji2ZvIAAAAJ&hl=uk&oi=ao), [lang-uk project](https://lang.org.ua), 2024
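
As a usage sketch for the entity labels listed in the model description, the snippet below groups recognized entities by label. It is illustrative only and not part of the README: the helper names `group_entities` and `tag_text` are hypothetical, and it assumes the pipeline has been installed as a spaCy package whose name matches the model name `uk_ner_web_trf_13class`.

```python
from collections import defaultdict


def group_entities(doc):
    """Group entity texts by label for any object exposing spaCy-style `.ents`."""
    grouped = defaultdict(list)
    for ent in doc.ents:
        # Each entity carries its surface text and a label such as PERS or LOC.
        grouped[ent.label_].append(ent.text)
    return dict(grouped)


def tag_text(text, model_name="uk_ner_web_trf_13class"):
    """Run the pipeline on `text` and return entities grouped by label.

    Assumption: the model is installed as a spaCy package under `model_name`.
    """
    import spacy  # imported lazily so group_entities works without spaCy

    nlp = spacy.load(model_name)
    return group_entities(nlp(text))
```

Calling `tag_text(...)` on the widget sentence from the front matter should return a dictionary keyed by labels such as `PERS` and `LOC`. The exact spans depend on the installed model version.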