LTEnjoy committed
Commit
7de2558
1 Parent(s): bf74671

Update README.md

Files changed (1): README.md (+30 -0)
README.md CHANGED
@@ -1,3 +1,33 @@
  ---
  license: mit
  ---
+ This model is provided for comparison with the official ESM-2 35M model. It takes only residue sequences as input but shares the same vocabulary as standard SaProt,
+ which means all structure tokens are marked as ``#``.
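+
+ As a minimal sketch (assuming you start from a plain amino-acid string with no structure available; variable names here are only illustrative), a sequence can be converted into this format by appending ``#`` to every residue:
+
+ ```
+ aa_seq = "MEVQLVQYK"                           # plain residue sequence
+ sa_seq = "".join(res + "#" for res in aa_seq)  # mask every structure token with "#"
+ print(sa_seq)                                  # M#E#V#Q#L#V#Q#Y#K#
+ ```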
+
+ ### Hugging Face model
+ The following code shows how to load the model and run a forward pass.
+ ```
+ from transformers import EsmTokenizer, EsmForMaskedLM
+
+ # Load the tokenizer and the sequence-only SaProt checkpoint
+ model_path = "/your/path/to/SaProt_35M_AF2_seqOnly"
+ tokenizer = EsmTokenizer.from_pretrained(model_path)
+ model = EsmForMaskedLM.from_pretrained(model_path)
+
+ #################### Example ####################
+ device = "cuda"
+ model.to(device)
+
+ # Residue sequence with all structure tokens masked as "#"
+ seq = "M#E#V#Q#L#V#Q#Y#K#"
+ tokens = tokenizer.tokenize(seq)
+ print(tokens)
+
+ inputs = tokenizer(seq, return_tensors="pt")
+ inputs = {k: v.to(device) for k, v in inputs.items()}
+
+ outputs = model(**inputs)
+ print(outputs.logits.shape)
+
+ """
+ ['M#', 'E#', 'V#', 'Q#', 'L#', 'V#', 'Q#', 'Y#', 'K#']
+ torch.Size([1, 11, 446])
+ """
+ ```