---
tags:
- pytorch_model_hub_mixin
- model_hub_mixin
license: gpl-3.0
---

This model has been pushed to the Hub using the [PyTorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
- Library: [TransfoRNA](https://github.com/gitHBDX/TransfoRNA/tree/master)
- Docs: [More Information Needed]
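
For reference, the mixin works by subclassing: any `nn.Module` that also inherits `PyTorchModelHubMixin` gains `save_pretrained`, `push_to_hub`, and `from_pretrained`. A minimal sketch (the `TinyModel` class, its layer, and the repo names are illustrative, not TransfoRNA's actual architecture):

```python
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin

class TinyModel(nn.Module, PyTorchModelHubMixin):
    """Illustrative model; init kwargs are serialized to config.json."""
    def __init__(self, hidden_size: int = 64):
        super().__init__()
        self.linear = nn.Linear(hidden_size, hidden_size)

    def forward(self, x):
        return self.linear(x)

# The mixin supplies Hub round-tripping:
# TinyModel(hidden_size=64).push_to_hub("user/tiny-model")
# model = TinyModel.from_pretrained("user/tiny-model")
```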

## Steps to run the model
- First, install [transforna](https://github.com/gitHBDX/TransfoRNA/tree/master).
- Example code:
```python
from transforna import GeneEmbeddModel, RnaTokenizer
import torch

model_name = 'Seq-Struct'
model_path = f"HBDX/{model_name}-TransfoRNA"

# Load model and tokenizer
model = GeneEmbeddModel.from_pretrained(model_path)
model.eval()

# Init tokenizer. The tokenizer automatically computes the secondary
# structure of each sequence using the ViennaRNA package.
tokenizer = RnaTokenizer.from_pretrained(model_path, model_name=model_name)
output = tokenizer(['AAAGTCGGAGGTTCGAAGACGATCAGATAC', 'TTTTCGGAACTGAGGCCATGATTAAGAGGG'])

# Inference. gene_embedd and second_input_embedd are the latent-space
# representations of the sequence and the second input, respectively;
# here, the second input is the sequence's secondary structure.
with torch.no_grad():
    (gene_embedd, second_input_embedd, activations,
     attn_scores_first, attn_scores_second) = model(output['input_ids'])

# Get sub-class labels
sub_class_labels = model.convert_ids_to_labels(activations)

# Get major-class labels
major_class_labels = model.convert_subclass_to_majorclass(sub_class_labels)
```
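
As a hypothetical next step, the latent embeddings can be compared across sequences. This sketch assumes `gene_embedd` is a batch-first tensor (one row per input sequence), which the snippet above does not guarantee:

```python
import torch.nn.functional as F

# Hypothetical usage: cosine similarity between the two inputs'
# latent representations (assumes gene_embedd is batch-first).
similarity = F.cosine_similarity(
    gene_embedd[0].flatten(), gene_embedd[1].flatten(), dim=0
)
print(f"Latent similarity: {similarity.item():.3f}")
print(sub_class_labels, major_class_labels)
```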