gugarosa committed
Commit 304b058
1 Parent(s): 654b690

Update README.md

Files changed (1)
  1. README.md +23 -2
README.md CHANGED
@@ -31,13 +31,34 @@ def print_prime(n):
  ```
  where the model generates the code after the comments. (Note: This is a legitimate and correct use of the `else` statement in Python loops.)
 
- **Notes**
+ **Notes:**
  * Phi-1 is intended for research purposes. The model-generated code should be treated as a starting point rather than a definitive solution for potential use cases. Users should be cautious when employing this model in their applications.
  * Direct adoption for production coding tasks is outside the scope of this research project. As a result, Phi-1 has not been tested to ensure that it performs adequately for production-level code. Please refer to the Limitations section of this document for more details.
  * If you are using `transformers>=4.36.0`, always load the model with `trust_remote_code=True` to prevent side effects (every loading example below passes this flag).
 
  ## Sample Code
 
+ There are four execution modes:
+
+ 1. FP16 / Flash-Attention / CUDA (requires the flash-attn package; see the fallback sketch after this list):
+ ```python
+ model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1", torch_dtype="auto", flash_attn=True, flash_rotary=True, fused_dense=True, device_map="cuda", trust_remote_code=True)
+ ```
+ 2. FP16 / CUDA:
+ ```python
+ model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1", torch_dtype="auto", device_map="cuda", trust_remote_code=True)
+ ```
+ 3. FP32 / CUDA:
+ ```python
+ model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1", torch_dtype=torch.float32, device_map="cuda", trust_remote_code=True)
+ ```
+ 4. FP32 / CPU:
+ ```python
+ model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1", torch_dtype=torch.float32, device_map="cpu", trust_remote_code=True)
+ ```
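
Mode 1 depends on the optional flash-attn kernels being installed. As a minimal sketch (our assumption, not part of the README diff), one can pick mode 1 when the package is available and otherwise fall back to the recommended mode 2:

```python
import importlib.util

from transformers import AutoModelForCausalLM

# Start from the mode 2 (FP16 / CUDA) arguments, which work on any CUDA GPU.
kwargs = {"torch_dtype": "auto", "device_map": "cuda", "trust_remote_code": True}

# Assumption: the mode 1 kwargs only work when the flash-attn package is installed.
if importlib.util.find_spec("flash_attn") is not None:
    kwargs.update(flash_attn=True, flash_rotary=True, fused_dense=True)

model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1", **kwargs)
```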
+
+ To ensure maximum compatibility, we recommend using the second execution mode (FP16 / CUDA), as follows:
+
  ```python
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer
@@ -57,7 +78,7 @@ text = tokenizer.batch_decode(outputs)[0]
  print(text)
  ```
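
The middle of this sample is collapsed diff context (the second hunk resumes at `print(text)`). As a minimal end-to-end sketch under the recommended FP16 / CUDA mode — the prompt and `max_length` here are illustrative, not taken from the diff:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer with trust_remote_code, per the notes above.
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1", torch_dtype="auto", device_map="cuda", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1", trust_remote_code=True)

# Illustrative prompt: the model completes code after a function signature.
prompt = '''def print_prime(n):
    """
    Print all primes between 1 and n
    """'''
inputs = tokenizer(prompt, return_tensors="pt", return_attention_mask=False).to("cuda")

outputs = model.generate(**inputs, max_length=200)
text = tokenizer.batch_decode(outputs)[0]
print(text)
```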
 
- **Remark.** In the generation function, our model currently does not support beam search (`num_beams > 1`).
+ **Remark:** In the generation function, our model currently does not support beam search (`num_beams > 1`).
  Furthermore, in the forward pass of the model, we currently do not support outputting hidden states or attention values, or using custom input embeddings.
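
As a hedged illustration of these constraints (not part of the diff; it reuses `model` and `inputs` from the sketch above), a generation call that stays within what the remote code supports:

```python
# Greedy decoding with the default num_beams=1; beam search (num_beams > 1)
# is not supported by this model's generation function.
outputs = model.generate(**inputs, max_length=200)

# The forward pass likewise does not support these kwargs:
#   model(**inputs, output_hidden_states=True)  # unsupported
#   model(**inputs, output_attentions=True)     # unsupported
```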
 
  ## Limitations of Phi-1