shanearora committed
Commit 80d8a2e
1 Parent(s): 6a4152c

Update README.md

Files changed (1)
  1. README.md +8 -17
README.md CHANGED
@@ -14,7 +14,7 @@ language:
 
 <!-- Provide a quick summary of what the model is/does. -->
 
- **For transformers versions v4.40.0 or newer, please use [OLMo 1B HF](https://huggingface.co/allenai/OLMo-1B-hf) instead.**
+ **For transformers versions v4.40.0 or newer, we suggest using [OLMo 1B HF](https://huggingface.co/allenai/OLMo-1B-hf) instead.**
 
 OLMo is a series of **O**pen **L**anguage **Mo**dels designed to enable the science of language models.
 The OLMo models are trained on the [Dolma](https://huggingface.co/datasets/allenai/dolma) dataset.
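The note above only suggests the HF-format repo for newer `transformers`; as a minimal sketch (the helper and the runtime check are illustrative, not part of the model card), one could pick the repo based on the installed version:

```python
# Illustrative helper (hypothetical): choose the OLMo-1B repo from the installed
# transformers version. transformers >= 4.40.0 loads the HF-format repo natively;
# older versions use this repo together with `pip install ai2-olmo`.
from packaging import version
import transformers

def pick_olmo_repo() -> str:
    if version.parse(transformers.__version__) >= version.parse("4.40.0"):
        return "allenai/OLMo-1B-hf"
    return "allenai/OLMo-1B"

print(pick_olmo_repo())
```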
@@ -42,8 +42,9 @@ In particular, we focus on four revisions of the 7B models:
 
 To load a specific model revision with HuggingFace, simply add the argument `revision`:
 ```bash
- import hf_olmo # pip install ai2-olmo
- olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-1B", revision="step20000-tokens84B")
+ from hf_olmo import OLMoForCausalLM # pip install ai2-olmo
+
+ olmo = OLMoForCausalLM.from_pretrained("allenai/OLMo-1B", revision="step20000-tokens84B")
 ```
 
 All revisions/branches are listed in the file `revisions.txt`.
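Besides reading `revisions.txt`, the revision names can also be enumerated from the Hub; a small sketch, assuming the `huggingface_hub` client is installed (this listing step is not shown in the model card itself):

```python
# Sketch: list the available revisions (branches) of allenai/OLMo-1B via the Hub API.
from huggingface_hub import list_repo_refs

refs = list_repo_refs("allenai/OLMo-1B")
for branch in refs.branches:
    print(branch.name)  # e.g. step20000-tokens84B
```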
@@ -93,11 +94,10 @@ pip install ai2-olmo
 ```
 Now, proceed as usual with HuggingFace:
 ```python
- import hf_olmo
+ from hf_olmo import OLMoForCausalLM, OLMoTokenizerFast
 
- from transformers import AutoModelForCausalLM, AutoTokenizer
- olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-1B")
- tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-1B")
+ olmo = OLMoForCausalLM.from_pretrained("allenai/OLMo-1B")
+ tokenizer = OLMoTokenizerFast.from_pretrained("allenai/OLMo-1B")
 message = ["Language modeling is "]
 inputs = tokenizer(message, return_tensors='pt', return_token_type_ids=False)
 # optional verifying cuda
@@ -107,17 +107,8 @@ response = olmo.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50,
 print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
 >> 'Language modeling is the first step to build natural language generation...'
 ```
- Alternatively, with the pipeline abstraction:
- ```python
- import hf_olmo
-
- from transformers import pipeline
- olmo_pipe = pipeline("text-generation", model="allenai/OLMo-1B")
- print(olmo_pipe("Language modeling is "))
- >> 'Language modeling is a branch of natural language processing that aims to...'
- ```
 
- Or, you can make this slightly faster by quantizing the model, e.g. `AutoModelForCausalLM.from_pretrained("allenai/OLMo-1B", torch_dtype=torch.float16, load_in_8bit=True)` (requires `bitsandbytes`).
+ You can make this slightly faster by quantizing the model, e.g. `AutoModelForCausalLM.from_pretrained("allenai/OLMo-1B", torch_dtype=torch.float16, load_in_8bit=True)` (requires `bitsandbytes`).
 The quantized model is more sensitive to typing / cuda, so it is recommended to pass the inputs as `inputs.input_ids.to('cuda')` to avoid potential issues.
 
 Note, you may see the following error if `ai2-olmo` is not installed correctly, which is caused by internal Python check naming. We'll update the code soon to make this error clearer.
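The quantization one-liner above can be expanded into a fuller example; a minimal sketch, assuming a CUDA device and that `ai2-olmo` and `bitsandbytes` are installed:

```python
# Sketch of the 8-bit quantized load described above; arguments follow the model card's example.
import torch
import hf_olmo  # registers the OLMo architecture with transformers (pip install ai2-olmo)
from transformers import AutoModelForCausalLM, AutoTokenizer

olmo = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-1B", torch_dtype=torch.float16, load_in_8bit=True
)
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-1B")

inputs = tokenizer(["Language modeling is "], return_tensors="pt", return_token_type_ids=False)
# The quantized model is sensitive to dtype/device, so pass input_ids moved to CUDA explicitly.
response = olmo.generate(inputs.input_ids.to("cuda"), max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```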
 