ritaranx committed on
Commit
7030ca2
1 Parent(s): b92c386

Update README.md

Files changed (1)
  1. README.md +5 -2
README.md CHANGED
@@ -52,6 +52,7 @@ queries = [
  get_detailed_instruct_query(task, 'Cis-acting lncRNAs control the expression of genes that are positioned in the vicinity of their transcription sites.'),
  get_detailed_instruct_query(task, 'Forkhead 0 (fox0) transcription factors are involved in apoptosis.')
  ]
+
  # No need to add instruction for retrieval documents
  documents = [
  get_detailed_instruct_passage("Gene regulation by the act of long non-coding RNA transcription Long non-protein-coding RNAs (lncRNAs) are proposed to be the largest transcript class in the mouse and human transcriptomes. Two important questions are whether all lncRNAs are functional and how they could exert a function. Several lncRNAs have been shown to function through their product, but this is not the only possible mode of action. In this review we focus on a role for the process of lncRNA transcription, independent of the lncRNA product, in regulating protein-coding-gene activity in cis. We discuss examples where lncRNA transcription leads to gene silencing or activation, and describe strategies to determine if the lncRNA product or its transcription causes the regulatory effect."),
@@ -60,16 +61,18 @@ documents = [
  input_texts = queries + documents
 
  max_length = 512
+
  # Tokenize the input texts
  batch_dict = tokenizer(input_texts, max_length=max_length, padding=True, truncation=True, return_tensors='pt')
 
  model.eval()
  with torch.no_grad():
- outputs = model(**batch_dict)
- embeddings = last_token_pool(outputs.last_hidden_state, batch_dict['attention_mask'])
+ outputs = model(**batch_dict)
+ embeddings = last_token_pool(outputs.last_hidden_state, batch_dict['attention_mask'])
  ```
 
  Then similarity scores between the different sentences are obtained with a dot product between the embeddings:
+
  ```python
  scores = (embeddings[:2] @ embeddings[2:].T)
  print(scores.tolist())
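
The scoring step in the README snippet takes the first two rows of `embeddings` as queries and the remaining rows as documents, then computes every query-document dot product. The sketch below reproduces that arithmetic in plain Python on toy, hand-made vectors, so the shape of the result (one row per query, one column per document) is easy to see. The helper names `dot` and `score_matrix` and the toy values are hypothetical; they are not part of the README or of any library.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def score_matrix(embeddings, num_queries):
    """Return one row of scores per query and one column per document.

    Plain-Python equivalent of
    `embeddings[:num_queries] @ embeddings[num_queries:].T`.
    """
    queries = embeddings[:num_queries]
    documents = embeddings[num_queries:]
    return [[dot(q, d) for d in documents] for q in queries]


# Toy, hand-made 2-d "embeddings": two queries followed by two documents.
toy_embeddings = [
    [1.0, 0.0],  # query 0
    [0.0, 1.0],  # query 1
    [0.5, 0.5],  # document 0
    [0.0, 2.0],  # document 1
]

scores = score_matrix(toy_embeddings, num_queries=2)
print(scores)  # [[0.5, 0.0], [0.5, 2.0]]
```

A raw dot product equals cosine similarity only when the embeddings are L2-normalized. The lines visible in this diff do not show whether normalization happens, so the scores here should be read as unnormalized inner products.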