nampham1106 committed on
Commit af4db17
1 Parent(s): 97ae579

Update README.md

Files changed (1)
  1. README.md +15 -8
README.md CHANGED
@@ -8,6 +8,7 @@ tags:
 - transformers
 datasets:
 - tarudesu/ViHealthQA
+license: mit
 ---

 # nampham1106/bkcare-embedding
@@ -18,19 +19,24 @@ This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentence

 ## Usage (Sentence-Transformers)

-Using this model becomes easy when you have [sentence-transformers](https://www.SBERT.net) installed:
-
-```
-pip install -U sentence-transformers
-```
+### Installation <a name="install1"></a>
+- Install `sentence-transformers`:
+
+- `pip install -U sentence-transformers`
+
+- Install `pyvi` to word segment:
+- `pip install pyvi`
+### Example usage <a name="usage1"></a>

 Then you can use the model like this:

 ```python
 from sentence_transformers import SentenceTransformer
-sentences = ["This is an example sentence", "Each sentence is converted"]
+from pyvi.ViTokenizer import tokenize
+sentences = ["Đang chích ngừa viêm gan B có chích ngừa Covid-19 được không?", "Nếu anh / chị đang tiêm ngừa vaccine phòng_bệnh viêm_gan B , anh / chị vẫn có_thể tiêm phòng vaccine phòng Covid-19 , tuy_nhiên vaccine Covid-19 phải được tiêm cách trước và sau mũi vaccine viêm gan B tối_thiểu là 14 ngày ."]

 model = SentenceTransformer('nampham1106/bkcare-embedding')
+sentences = [tokenize(sentence) for sentence in sentences]
 embeddings = model.encode(sentences)
 print(embeddings)
 ```
@@ -43,7 +49,7 @@ Without [sentence-transformers](https://www.SBERT.net), you can use the model li
 ```python
 from transformers import AutoTokenizer, AutoModel
 import torch
-
+from pyvi.ViTokenizer import tokenize

 #Mean Pooling - Take attention mask into account for correct averaging
 def mean_pooling(model_output, attention_mask):
@@ -53,12 +59,13 @@ def mean_pooling(model_output, attention_mask):


 # Sentences we want sentence embeddings for
-sentences = ['This is an example sentence', 'Each sentence is converted']
+sentences = ["Đang chích ngừa viêm gan B có chích ngừa Covid-19 được không?", "Nếu anh / chị đang tiêm ngừa vaccine phòng_bệnh viêm_gan B , anh / chị vẫn có_thể tiêm phòng vaccine phòng Covid-19 , tuy_nhiên vaccine Covid-19 phải được tiêm cách trước và sau mũi vaccine viêm gan B tối_thiểu là 14 ngày ."]

 # Load model from HuggingFace Hub
 tokenizer = AutoTokenizer.from_pretrained('nampham1106/bkcare-embedding')
 model = AutoModel.from_pretrained('nampham1106/bkcare-embedding')

+sentences = [tokenize(sentence) for sentence in sentences]
 # Tokenize sentences
 encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

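
Note on the change: the `tokenize` call added in this commit word-segments each sentence before it is encoded, so that multi-syllable Vietnamese words are joined with underscores (as in `viêm_gan` and `có_thể` in the example sentence). The sketch below only illustrates the shape of that preprocessing step. `TOY_LEXICON`, `toy_segment`, and `preprocess` are hypothetical names invented for this illustration, and the lexicon lookup is a stand-in assumption, not how `pyvi` segments text; the real `pyvi.ViTokenizer.tokenize` output may differ.

```python
import unicodedata


def _key(syllable):
    """Normalize a syllable so lexicon lookups ignore case and Unicode form."""
    return unicodedata.normalize("NFC", syllable).lower()


# Hypothetical toy lexicon of two-syllable words (an assumption for this sketch;
# pyvi relies on its own segmentation model rather than a fixed list like this).
TOY_LEXICON = {
    (_key(a), _key(b))
    for a, b in [("viêm", "gan"), ("có", "thể"), ("tối", "thiểu")]
}


def toy_segment(sentence, lexicon=TOY_LEXICON):
    """Join adjacent syllable pairs found in the lexicon with an underscore."""
    syllables = sentence.split()
    words = []
    i = 0
    while i < len(syllables):
        pair = tuple(_key(s) for s in syllables[i:i + 2])
        if len(pair) == 2 and pair in lexicon:
            words.append("_".join(syllables[i:i + 2]))
            i += 2
        else:
            words.append(syllables[i])
            i += 1
    return " ".join(words)


def preprocess(sentences, segment=toy_segment):
    """Apply a segmenter to every sentence, mirroring the list comprehension in the diff."""
    return [segment(sentence) for sentence in sentences]


sentences = ["Đang chích ngừa viêm gan B có chích ngừa Covid-19 được không?"]
segmented = preprocess(sentences)
print(segmented)
```

To run the actual preprocessing from the README, `pyvi.ViTokenizer.tokenize` could be passed as the `segment` argument, since it also takes a string and returns a string.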