vaibhavad committed on
Commit
be76331
1 Parent(s): f1cbc9d

Update README.md

Files changed (1)
  1. README.md +71 -11
README.md CHANGED
@@ -1,20 +1,36 @@
  ---
  library_name: transformers
- tags: []
  ---

- # LLM2Vec

- > 2 line summary of our contribution
- >
- - **Repository:**
- - **Paper:**


-
- ## Quick start
- <hr />
-
  ## Installation
  ```bash
  pip install llm2vec
@@ -22,7 +38,51 @@ pip install llm2vec

  ## Usage
  ```python

  ```

- ## Citation

  ---
  library_name: transformers
+ license: mit
+ language:
+ - en
+ pipeline_tag: sentence-similarity
+ tags:
+ - text-embedding
+ - embeddings
+ - information-retrieval
+ - beir
+ - text-classification
+ - language-model
+ - text-clustering
+ - text-semantic-similarity
+ - text-evaluation
+ - text-reranking
+ - feature-extraction
+ - sentence-similarity
+ - Sentence Similarity
+ - natural_questions
+ - ms_marco
+ - fever
+ - hotpot_qa
+ - mteb
  ---

+ # LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders

+ > LLM2Vec is a simple recipe to convert decoder-only LLMs into text encoders. It consists of 3 simple steps: 1) enabling bidirectional attention, 2) masked next token prediction, and 3) unsupervised contrastive learning. The model can be further fine-tuned to achieve state-of-the-art performance.
+ - **Repository:** https://github.com/McGill-NLP/llm2vec


  ## Installation
  ```bash
  pip install llm2vec
 

  ## Usage
  ```python
+ from llm2vec import LLM2Vec
+
+ import torch
+ from transformers import AutoTokenizer, AutoModel, AutoConfig
+ from peft import PeftModel
+
+ # Loading base Mistral model, along with custom code that enables bidirectional connections in decoder-only LLMs. MNTP LoRA weights are merged into the base model.
+ tokenizer = AutoTokenizer.from_pretrained("McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp")
+ config = AutoConfig.from_pretrained("McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp", trust_remote_code=True)
+ model = AutoModel.from_pretrained("McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp", trust_remote_code=True, config=config, torch_dtype=torch.bfloat16)
+ model = PeftModel.from_pretrained(model, "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp")
+ model = model.merge_and_unload() # This can take several minutes
+
+ # Loading supervised model. This loads the trained LoRA weights on top of MNTP model. Hence the final weights are -- Base model + MNTP (LoRA) + supervised (LoRA).
+ model = PeftModel.from_pretrained(model, "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-supervised")
+
+ # Wrapper for encoding and pooling operations
+ l2v = LLM2Vec(model, tokenizer, pooling_mode="mean", max_length=512)
+
+ # Encoding queries using instructions
+ instruction = 'Given a web search query, retrieve relevant passages that answer the query:'
+ queries = [
+     [instruction, 'how much protein should a female eat'],
+     [instruction, 'summit define']
+ ]
+ q_reps = l2v.encode(queries)
+
+ # Encoding documents. Instructions are not required for documents
+ documents = [
+     "As a general guideline, the CDC's average requirement of protein for women ages 19 to 70 is 46 grams per day. But, as you can see from this chart, you'll need to increase that if you're expecting or training for a marathon. Check out the chart below to see how much protein you should be eating each day.",
+     "Definition of summit for English Language Learners. : 1 the highest point of a mountain : the top of a mountain. : 2 the highest level. : 3 a meeting or series of meetings between the leaders of two or more governments."
+ ]
+ d_reps = l2v.encode(documents)
+
+ # Compute cosine similarity
+ q_reps_norm = torch.nn.functional.normalize(q_reps, p=2, dim=1)
+ d_reps_norm = torch.nn.functional.normalize(d_reps, p=2, dim=1)
+ cos_sim = torch.mm(q_reps_norm, d_reps_norm.transpose(0, 1))

+ print(cos_sim)
+ """
+ tensor([[0.5486, 0.0554],
+         [0.0567, 0.5437]])
+ """
  ```
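
The last step of the usage snippet (L2-normalize each row, then take a matrix product) can be checked without downloading the model. The following is a minimal sketch in plain Python, assuming made-up toy vectors in place of `q_reps` and `d_reps`. The helper names `l2_normalize` and `cosine_similarity_matrix` are illustrative and not part of `llm2vec`.

```python
import math


def l2_normalize(rows):
    """Scale each row to unit Euclidean length (same idea as F.normalize with p=2, dim=1)."""
    normalized = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row))
        norm = max(norm, 1e-12)  # guard against division by zero
        normalized.append([x / norm for x in row])
    return normalized


def cosine_similarity_matrix(queries, documents):
    """Entry [i][j] is the cosine similarity between query i and document j."""
    q = l2_normalize(queries)
    d = l2_normalize(documents)
    # Dot products of unit vectors are cosine similarities.
    return [[sum(a * b for a, b in zip(q_row, d_row)) for d_row in d] for q_row in q]


# Toy embeddings (assumed values, not real model output).
toy_q_reps = [[1.0, 0.0], [0.0, 2.0]]
toy_d_reps = [[3.0, 0.0], [1.0, 1.0]]

cos_sim = cosine_similarity_matrix(toy_q_reps, toy_d_reps)
print(cos_sim)
```

Because every row is scaled to unit length first, vector magnitude has no effect on the score; only direction does.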

+ ## Questions
+ If you have any questions about the code, feel free to email Parishad (`parishad.behnamghader@mila.quebec`) and Vaibhav (`vaibhav.adlakha@mila.quebec`).
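
The first step named in the README summary, enabling bidirectional attention, can be pictured as swapping the causal attention mask of a decoder-only model for an all-ones mask. The following is a toy sketch of that idea only, not the `llm2vec` implementation. The helper names `causal_mask`, `bidirectional_mask`, and `visible_positions` are hypothetical.

```python
def causal_mask(seq_len):
    """mask[i][j] is True when position i may attend to position j (only j <= i)."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def bidirectional_mask(seq_len):
    """Every position may attend to every other position."""
    return [[True] * seq_len for _ in range(seq_len)]


def visible_positions(mask, i):
    """List the positions that token i is allowed to attend to."""
    return [j for j, allowed in enumerate(mask[i]) if allowed]


causal = causal_mask(4)
bidirectional = bidirectional_mask(4)

# Under a causal mask the first token sees only itself;
# under a bidirectional mask it sees the whole sequence.
first_token_causal = visible_positions(causal, 0)
first_token_bidirectional = visible_positions(bidirectional, 0)
print(first_token_causal, first_token_bidirectional)
```

With the causal mask, early tokens cannot use later context, which limits how well a pooled representation reflects the full input. The bidirectional mask removes that restriction.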