Alan Joshua committed on
Commit 893a410 · verified · 1 Parent(s): 8c9803b

Update README.md

Files changed (1)
  1. README.md +0 -20
README.md CHANGED
@@ -54,26 +54,6 @@ A 34M parameter sentence embedding model trained from scratch using PyTorch.
  - `onnx/biencoder_rope.onnx` — ONNX FP32
  - `onnx/biencoder_rope_int8.onnx` — ONNX INT8 (recommended for CPU)

- ## Usage
- ```python
- import torch
- from transformers import AutoTokenizer
-
- tokenizer = AutoTokenizer.from_pretrained("your-username/your-model-name", subfolder="tokenizer")
- model = BiEncoderRoPE().to("cuda")
- model.load_state_dict(
-     torch.load("pytorch/checkpoint_phase4_nq.pt")["model_state"]
- )
- model.eval()
-
- @torch.no_grad()
- def encode(texts):
-     if isinstance(texts, str): texts = [texts]
-     enc = tokenizer(texts, padding=True, truncation=True,
-                     max_length=256, return_tensors="pt")
-     return model.encode(enc["input_ids"].cuda(), enc["attention_mask"].cuda()).cpu()
- ```
-
  ## Performance
  - FP32 ONNX size : 134.3 MB
  - INT8 ONNX size : 34.6 MB
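
The two ONNX sizes in the context lines can be checked against the 34M parameter count quoted in the hunk header. The following is a minimal sketch of that arithmetic, not something taken from the repository. It assumes that "34M" means exactly 34e6 parameters and that "MB" means 10**6 bytes, neither of which the README states. `bytes_per_param` is a hypothetical helper introduced only for this illustration.

```python
# Sanity check on the sizes listed under "Performance".
# Assumptions (not stated in the README): "34M" means exactly 34e6
# parameters and "MB" means 10**6 bytes.

PARAMS = 34e6
BYTES_PER_MB = 10**6


def bytes_per_param(size_mb: float, params: float = PARAMS) -> float:
    """Average on-disk bytes per parameter for a file of size_mb megabytes."""
    return size_mb * BYTES_PER_MB / params


fp32_mb = 134.3  # FP32 ONNX size from the README
int8_mb = 34.6   # INT8 ONNX size from the README

fp32_bpp = bytes_per_param(fp32_mb)
int8_bpp = bytes_per_param(int8_mb)
ratio = fp32_mb / int8_mb

print(f"FP32: {fp32_bpp:.2f} bytes/param")
print(f"INT8: {int8_bpp:.2f} bytes/param")
print(f"INT8 file is {ratio:.2f}x smaller than FP32")
```

Under those assumptions the FP32 file comes to just under 4 bytes per parameter and the INT8 file to about 1, which matches the widths of the two data types. The INT8 file is a little under 4 times smaller than the FP32 one.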
 