---
license: mit
---

## Model Summary

Phi-3-22b is a depth-upsampled version of the 14b [Phi-3-medium-128k-instruct](https://huggingface.co/microsoft/Phi-3-medium-128k-instruct). We took two copies of the 14b model, removed the bottom 8 layers from one copy and the top 8 layers from the other, and stacked the two. We plan to do continued pretraining to improve performance.

Because this model has not yet had any continued pretraining, output quality may vary.
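
The layer stacking described above can be sketched with plain layer indices. This is only an illustration under stated assumptions, not the script used to build the model: the function name `stacked_layer_indices` is hypothetical, the 40-layer depth of the base model is assumed (check `config.num_hidden_layers`), "bottom" is read as the layers nearest the input, and the order in which the two copies are stacked is a guess.

```python
def stacked_layer_indices(n_layers: int, n_removed: int) -> list[int]:
    """Return, for each layer of the stacked model, the base-model layer it copies."""
    if not 0 <= n_removed < n_layers:
        raise ValueError("n_removed must be in [0, n_layers)")
    # Copy A loses its top (last) n_removed layers.
    copy_a = list(range(0, n_layers - n_removed))
    # Copy B loses its bottom (first) n_removed layers.
    copy_b = list(range(n_removed, n_layers))
    # Assumption: copy A sits below copy B, so the middle layers appear twice.
    return copy_a + copy_b


# Assumed values: 40 base layers, 8 removed from each copy.
stacked = stacked_layer_indices(n_layers=40, n_removed=8)
print(len(stacked))      # 64
print(stacked[30:34])    # [30, 31, 8, 9]
```

Under these assumptions the stacked model has 64 layers, and the seam between the two copies jumps from base layer 31 back to base layer 8.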

A [GGUF version](https://huggingface.co/mradermacher/phi-3-22b-GGUF) is available, thanks to @mradermacher!

Loading the model:

```
# Notebook syntax: drop the leading "!" when running pip from a regular shell
!pip install flash-attn --no-build-isolation
!pip install peft bitsandbytes accelerate transformers

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("ontocord/phi-3-22b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("ontocord/phi-3-22b",
    torch_dtype="auto", device_map="auto", trust_remote_code=True)
```

Basic test:

```
inputs = tokenizer("<|user|>\nHow to explain Internet for a medieval knight?<|end|>\n<|assistant|>\n", return_tensors="pt").to(model.device)
with torch.no_grad():
    # use_cache is a generate() argument, so it is passed there rather than to batch_decode()
    print(tokenizer.batch_decode(model.generate(**inputs, max_new_tokens=128, use_cache=True))[0])
```

This produces output similar to:

```
<|user|> How to explain Internet for a medieval knight?<|end|><|assistant|> Ah, noble knight, let me attempt to explain this mystical realm known as the Internet in terms that might resonate with your medieval understanding.

Imagine, if you will, a vast kingdom stretching beyond the horizon, where countless villages, towns, and cities are connected by a network of roads, bridges, and pathways. This kingdom is not bound by physical borders, but instead, it exists in a realm beyond our own, accessible only through magical devices known as computers, tablets, and smartphs.

In this kingdom, information flows like a mighty river,...
```
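
The prompts in these examples all follow the same single-turn pattern: a `<|user|>` line, the message, `<|end|>`, then an `<|assistant|>` line. A small helper can build that string so the special tokens are not retyped each time. The name `build_phi3_prompt` is hypothetical and only covers the single-turn case shown here; if the tokenizer ships a chat template, `tokenizer.apply_chat_template` may be a better fit, though that is not verified for this repository.

```python
def build_phi3_prompt(user_message: str) -> str:
    """Wrap one user message in the single-turn format used in the examples above."""
    return f"<|user|>\n{user_message}<|end|>\n<|assistant|>\n"


prompt = build_phi3_prompt("How to explain Internet for a medieval knight?")
print(repr(prompt))
```

The resulting string can be passed to `tokenizer(prompt, return_tensors="pt")` in place of the inline literal.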

To run on a Colab T4, try 4-bit quantization:

```
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

tokenizer = AutoTokenizer.from_pretrained("ontocord/phi-3-22b", trust_remote_code=True)
# Same effect as the older load_in_4bit=True shortcut, expressed through a quantization config
model = AutoModelForCausalLM.from_pretrained("ontocord/phi-3-22b",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True), device_map="auto", trust_remote_code=True)

inputs = tokenizer("<|user|>\nHow to explain Internet for a medieval knight?<|end|>\n<|assistant|>\n", return_tensors="pt").to(model.device)
with torch.no_grad():
    # use_cache is a generate() argument, so it is passed there rather than to batch_decode()
    print(tokenizer.batch_decode(model.generate(**inputs, max_new_tokens=128, use_cache=True))[0])
```

This produces output similar to:

```
<|user|> How to explain Internet for a medieval knight?<|end|><|assistant|> Ah, noble knight, let me attempt to explain this mystical network known as the Internet, using terms and analogies from your time.

Imagine a vast kingdom, stretching far beyond the horizon, where countless villages, towns, and cities are connected by roads, rivers, and paths. Each village is like a castle, filled with people who share knowledge, goods, stories, and news.

Now, imagine that instead of messengers, horses, or ships, there exists a magical network of invisible threads connecting all these villages. This network is invisible to the eye, yet it allows messages, scroll
```
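
A rough back-of-the-envelope estimate shows why 4-bit loading is suggested for a T4, which has roughly 16 GB of memory. The sketch below assumes a nominal 22 billion parameters (taken from the model name) and counts weights only; it ignores quantization overhead, activations, and the KV cache, so real usage will be higher. The helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


N_PARAMS = 22e9  # nominal parameter count, assumed from the "22b" in the model name

for label, bits in [("16-bit", 16), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gib(N_PARAMS, bits):.1f} GiB")
# 16-bit: 41.0 GiB
# 4-bit: 10.2 GiB
```

By this estimate the 16-bit weights do not fit on a single T4, while the 4-bit weights leave a few gigabytes of headroom.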

Another example:

```
import torch

inputs = tokenizer("<|user|>\nExplain why it is surprising that one can build a language model small enough to fit on a phone, yet almost as powerful as ChatGPT. Just use one funny sentence.<|end|>\n<|assistant|>\n", return_tensors="pt").to(model.device)
with torch.no_grad():
    # use_cache is a generate() argument, so it is passed there rather than to batch_decode()
    print(tokenizer.batch_decode(model.generate(**inputs, max_new_tokens=128, use_cache=True))[0])
```

This produces output similar to:

```
<|user|> Explain why it is surprising that one can build a language model small enough to fit on a phone, yet almost as powerful as ChatGPT. Just use one funny sentence.<|end|><|assistant|> "Who knew that fitting a ChatGPT rival in your pocket would be easier than fitting a penguin in a pocket-sized suit!"<|end|>
```

Some harder reasoning tests of the model are available in this [Colab notebook](https://colab.research.google.com/drive/1eLoQXhysnBmN7DNNB6yElpELOSe6DHHH?usp=sharing).

See the [Phi-3-medium-128k-instruct](https://huggingface.co/microsoft/Phi-3-medium-128k-instruct) model card for more details.