danfu09 committed
Commit ba40ea2
1 Parent(s): 6082b2d

Update README.md

Files changed (1): README.md (+61, -8)
README.md CHANGED
@@ -2,7 +2,7 @@
 license: apache-2.0
 language:
 - en
-pipeline_tag: sentence-similarity
+pipeline_tag: text-classification
 inference: false
 ---
 
@@ -11,6 +11,8 @@ inference: false
 The 80M checkpoint for M2-BERT-base from the paper [Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture](https://arxiv.org/abs/2310.12109).
 This model has been pretrained with sequence length 2048, and it has been fine-tuned for long-context retrieval.
 
+Check out our [blog post]() for more on how we trained this model for long sequences.
+
 This model was trained by Jon Saad-Falcon, Dan Fu, and Simran Arora.
 
 Check out our [GitHub](https://github.com/HazyResearch/m2/tree/main) for instructions on how to download and fine-tune it!
@@ -19,21 +21,72 @@ Check out our [GitHub](https://github.com/HazyResearch/m2/tree/main) for instructions on how to download and fine-tune it!
 
 You can load this model using Hugging Face `AutoModel`:
 ```python
-from transformers import AutoModelForMaskedLM
-model = AutoModelForMaskedLM.from_pretrained("togethercomputer/m2-bert-80M-2k-retrieval", trust_remote_code=True)
+from transformers import AutoModelForSequenceClassification
+model = AutoModelForSequenceClassification.from_pretrained(
+    "togethercomputer/m2-bert-80M-2k-retrieval",
+    trust_remote_code=True
+)
 ```
 
+You should expect to see a large error message about unused parameters for FlashFFTConv.
+If you'd like to load the model with FlashFFTConv, you can check out our [GitHub](https://github.com/HazyResearch/m2/tree/main).
+
 This model generates embeddings for retrieval. The embeddings have a dimensionality of 768:
-```
-from transformers import AutoTokenizer, AutoModelForMaskedLM
+```python
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
 
 max_seq_length = 2048
 testing_string = "Every morning, I make a cup of coffee to start my day."
-model = AutoModelForMaskedLM.from_pretrained("togethercomputer/m2-bert-80M-2k-retrieval", trust_remote_code=True)
+model = AutoModelForSequenceClassification.from_pretrained(
+    "togethercomputer/m2-bert-80M-2k-retrieval",
+    trust_remote_code=True
+)
 
-tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", model_max_length=max_seq_length)
-input_ids = tokenizer([testing_string], return_tensors="pt", padding="max_length", return_token_type_ids=False, truncation=True, max_length=max_seq_length)
+tokenizer = AutoTokenizer.from_pretrained(
+    "bert-base-uncased",
+    model_max_length=max_seq_length
+)
+input_ids = tokenizer(
+    [testing_string],
+    return_tensors="pt",
+    padding="max_length",
+    return_token_type_ids=False,
+    truncation=True,
+    max_length=max_seq_length
+)
 
 outputs = model(**input_ids)
 embeddings = outputs['sentence_embedding']
 ```
+
+You can also get embeddings from this model using the Together API as follows (you can find your API key [here](https://api.together.xyz/settings/api-keys)):
+```python
+import os
+import requests
+
+def generate_together_embeddings(text: str, model_api_string: str, api_key: str):
+    url = "https://api.together.xyz/api/v1/embeddings"
+    headers = {
+        "accept": "application/json",
+        "content-type": "application/json",
+        "Authorization": f"Bearer {api_key}"
+    }
+    session = requests.Session()
+    response = session.post(
+        url,
+        headers=headers,
+        json={
+            "input": text,
+            "model": model_api_string
+        }
+    )
+    if response.status_code != 200:
+        raise ValueError(f"Request failed with status code {response.status_code}: {response.text}")
+    return response.json()['data'][0]['embedding']
+
+print(generate_together_embeddings(
+    'Hello world',
+    'togethercomputer/m2-bert-80M-2k-retrieval',
+    os.environ['TOGETHER_API_KEY']
+)[:10])
+```
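
As a usage note on the embedding snippets above: the 768-dimensional `sentence_embedding` vectors are meant to be compared against each other for retrieval, and cosine similarity is a common scoring choice. Below is a minimal sketch of that step, assuming the model and tokenizer are loaded exactly as in the updated README; the `embed` helper and the two example strings are hypothetical illustrations, and `torch` (a dependency of `transformers` models) is assumed to be available.

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForSequenceClassification

max_seq_length = 2048

# Load the model and tokenizer as shown in the updated README.
model = AutoModelForSequenceClassification.from_pretrained(
    "togethercomputer/m2-bert-80M-2k-retrieval",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(
    "bert-base-uncased",
    model_max_length=max_seq_length
)

def embed(text: str) -> torch.Tensor:
    # Hypothetical helper: same tokenization settings as the README example.
    inputs = tokenizer(
        [text],
        return_tensors="pt",
        padding="max_length",
        return_token_type_ids=False,
        truncation=True,
        max_length=max_seq_length,
    )
    with torch.no_grad():
        # Per the README, the remote-code model returns a dict whose
        # 'sentence_embedding' entry has shape (batch_size, 768).
        return model(**inputs)["sentence_embedding"]

query = embed("How do people start their mornings?")
passage = embed("Every morning, I make a cup of coffee to start my day.")

# Higher cosine similarity means a better retrieval match.
print(F.cosine_similarity(query, passage).item())
```

In an actual retrieval setup you would embed each candidate passage once, cache the vectors, and rank passages by this score for each incoming query.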