Update README.md
README.md
@@ -1,7 +1,7 @@
 ---
 license: mit
 ---
-Github repo: https://github.com/westlake-repl/ProTrek
+**Github repo: https://github.com/westlake-repl/ProTrek**
 
 ## Overview
 ProTrek is a multimodal model that integrates protein sequence, protein structure, and text information for better
@@ -11,10 +11,12 @@ does.
 
 ## Model architecture
 Protein sequence encoder: [esm2_t12_35M_UR50D](https://huggingface.co/facebook/esm2_t12_35M_UR50D)
+
 Protein structure encoder: foldseek_t12_35M (identical architecture with esm2 except that the vocabulary only contains 3Di tokens)
+
 Text encoder: [BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext](https://huggingface.co/microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext)
 
-## Obtain embeddings and calculate similarity score (please clone
+## Obtain embeddings and calculate similarity score (please clone our repo first)
 ```
 import torch
 