---
language:
- bg
license: mit
pipeline_tag: token-classification
model-index:
- name: punctual-bert-bg
  results: []
widget:
- text: 'Човекът искащ безгрижно писане ме помоли да създам този модел.'
---


# punctual-bert-bg
To try the model in the browser, visit the [Zapetayko](https://zapetayko.streamlit.app/) website.

## Usage
```python
from transformers import pipeline


MODEL_ID = "auhide/punctual-bert-bg"

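# Load the model and its tokenizer as a token-classification pipeline.
punctuate = pipeline("token-classification", model=MODEL_ID, tokenizer=MODEL_ID)
# Tag the tokens that sit next to each missing comma in an unpunctuated sentence.
punctuate("Човекът искащ безгрижно писане ме помоли да създам този модел.")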
```
```python
[{'entity': 'B-CMA',
  'score': 0.95041466,
  'index': 1,
  'word': '▁Човекът',
  'start': 0,
  'end': 7},
 {'entity': 'I-CMA',
  'score': 0.95229745,
  'index': 2,
  'word': '▁иска',
  'start': 7,
  'end': 12},
 {'entity': 'B-CMA',
  'score': 0.95945585,
  'index': 5,
  'word': '▁писане',
  'start': 23,
  'end': 30},
 {'entity': 'I-CMA',
  'score': 0.90768945,
  'index': 6,
  'word': '▁ме',
  'start': 30,
  'end': 33}]
```

`B-CMA` marks the token immediately before a comma, and `I-CMA` marks the token immediately after it. Tokens that are not adjacent to a comma do not appear in the output.

Placing a comma after each `B-CMA` token therefore gives:

*"Човекът, искащ безгрижно писане, ме помоли да създам този модел."*
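The comma placement described above can be sketched in a few lines of Python. This is only an illustration, not the post-processing used by the model author or the Zapetayko site: `insert_commas` is a hypothetical helper, and it assumes that each `B-CMA` prediction's `end` offset points into (or just past) the word that should be followed by a comma. The sample predictions are copied from the output above, with the `score` and `index` fields dropped, so the snippet runs without downloading the model.

```python
def insert_commas(text, predictions):
    """Insert a comma after every word that contains a B-CMA token.

    Hypothetical helper: relies only on the 'entity' and 'end' fields
    of the pipeline output shown above.
    """
    positions = set()
    for pred in predictions:
        if pred["entity"] != "B-CMA":
            continue
        pos = pred["end"]
        # If the tagged token is a subword, advance to the end of the word.
        while pos < len(text) and text[pos].isalnum():
            pos += 1
        positions.add(pos)

    # Insert from right to left so the earlier offsets stay valid.
    for pos in sorted(positions, reverse=True):
        if not text[:pos].endswith(","):
            text = text[:pos] + "," + text[pos:]
    return text


text = "Човекът искащ безгрижно писане ме помоли да създам този модел."
predictions = [
    {"entity": "B-CMA", "word": "▁Човекът", "start": 0, "end": 7},
    {"entity": "I-CMA", "word": "▁иска", "start": 7, "end": 12},
    {"entity": "B-CMA", "word": "▁писане", "start": 23, "end": 30},
    {"entity": "I-CMA", "word": "▁ме", "start": 30, "end": 33},
]

result = insert_commas(text, predictions)
print(result)
# Човекът, искащ безгрижно писане, ме помоли да създам този модел.
```

The sketch uses only the `B-CMA` tags, because each `I-CMA` tag here marks the other side of the same comma. In real use, `predictions` would be the list returned by `punctuate(text)`.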