huu-ontocord committed • 15cc43e • Parent(s): bc7eca0
Update README.md

README.md CHANGED
@@ -7,8 +7,6 @@ license: mit
 The Phi-3-22b is a depth upsampled version of the 14b [Phi-3-medium-128k-instruct](https://huggingface.co/microsoft/Phi-3-medium-128k-instruct). We removed the bottom 8 layers of one copy of the 14b model and the top 8 layers of another copy, then stacked them. We plan to do continued pretraining to improve performance.
 Since this model has not undergone continued pretraining, quality may vary.
 
-Some tests of the model in [colab](https://colab.research.google.com/drive/1eLoQXhysnBmN7DNNB6yElpELOSe6DHHH?usp=sharing).
-
 ```
 !pip install flash-attn --no-build-isolation
 !pip install peft bitsandbytes accelerate transformers
@@ -61,4 +59,9 @@ Will produce:
 ```
 <|user|> Explain why it is surprising that one can build a language model small enough to fit on a phone, yet almost as powerful as ChatGPT. Just use one funny sentence.<|end|><|assistant|> "Who knew that fitting a ChatGPT rival in your pocket would be easier than fitting a penguin in a pocket-sized suit!"<|end|>
 ```
+
+
+Some more tests of the model in [colab](https://colab.research.google.com/drive/1eLoQXhysnBmN7DNNB6yElpELOSe6DHHH?usp=sharing).
+
+
 See the [Phi-3-medium-128k-instruct](https://huggingface.co/microsoft/Phi-3-medium-128k-instruct) model card for more details.
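The depth-upsampling recipe the README describes (drop the bottom 8 layers of one copy of the model and the top 8 layers of another copy, then stack the remainders) can be sketched in plain PyTorch. This is a minimal illustration on a toy layer stack, not the actual merge script used for Phi-3-22b; `depth_upsample` is a hypothetical helper name, and the `nn.Linear` modules stand in for transformer decoder blocks.

```python
import copy

import torch.nn as nn


def depth_upsample(layers: nn.ModuleList, n_drop: int) -> nn.ModuleList:
    """Stack two copies of a decoder-layer stack, per the recipe above:
    copy A keeps everything but its top n_drop layers, copy B keeps
    everything but its bottom n_drop layers, and A is placed under B."""
    copy_a = copy.deepcopy(layers[: len(layers) - n_drop])  # top n_drop removed
    copy_b = copy.deepcopy(layers[n_drop:])                 # bottom n_drop removed
    return nn.ModuleList(list(copy_a) + list(copy_b))


# Toy stand-in for the 14b model's 40 decoder blocks.
toy_layers = nn.ModuleList(nn.Linear(8, 8) for _ in range(40))
merged = depth_upsample(toy_layers, n_drop=8)
print(len(merged))  # (40 - 8) + (40 - 8) = 64 layers
```

Dropping 8 layers from each copy means the middle layers appear twice in the merged stack, which is why continued pretraining is planned to smooth the seam.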
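The sample transcript uses Phi-3's chat markup (`<|user|>`, `<|end|>`, `<|assistant|>`). As a rough sketch, a single-turn prompt in the format shown above can be assembled by hand like this; `build_phi3_prompt` is an illustrative helper, and in practice the tokenizer's `apply_chat_template` produces the canonical string.

```python
def build_phi3_prompt(user_message: str) -> str:
    """Assemble a single-turn prompt matching the chat markup shown in
    the sample output above (illustrative; prefer the tokenizer's own
    chat template in real use)."""
    return f"<|user|> {user_message}<|end|><|assistant|>"


prompt = build_phi3_prompt("Just use one funny sentence.")
print(prompt)  # <|user|> Just use one funny sentence.<|end|><|assistant|>
```

Generation is then run on this string, and the model's reply is read up to the next `<|end|>` token.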