duyntnet committed on
Commit 180087c
1 Parent(s): 5347a54

Upload README.md with huggingface_hub

Files changed (1)
  README.md +99 -0
README.md ADDED
@@ -0,0 +1,99 @@
---
license: other
language:
- en
pipeline_tag: text-generation
inference: false
tags:
- transformers
- gguf
- imatrix
- Phi-3.5-mini-instruct
---
Quantizations of https://huggingface.co/microsoft/Phi-3.5-mini-instruct


### Inference Clients/UIs
* [llama.cpp](https://github.com/ggerganov/llama.cpp)
* [JanAI](https://github.com/janhq/jan)
* [KoboldCPP](https://github.com/LostRuins/koboldcpp)
* [text-generation-webui](https://github.com/oobabooga/text-generation-webui)
* [ollama](https://github.com/ollama/ollama)
* [GPT4All](https://github.com/nomic-ai/gpt4all)
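
Any of these clients can load the GGUF files from this repo. For a quick scripted check, here is a minimal sketch using the `llama-cpp-python` bindings (built on llama.cpp, though not listed above, so treat the choice as an assumption); the filename `Phi-3.5-mini-instruct-Q4_K_M.gguf` is a hypothetical example, so substitute whichever quantization you downloaded:

```python
# pip install llama-cpp-python
from llama_cpp import Llama

# Load a quantized GGUF file; the path below is a placeholder example.
llm = Llama(model_path="Phi-3.5-mini-instruct-Q4_K_M.gguf", n_ctx=4096)

# llama-cpp-python applies the chat template embedded in the GGUF
# metadata when one is available.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain GGUF in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```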

---

# From original readme

Phi-3.5-mini is a lightweight, state-of-the-art open model built upon datasets used for Phi-3 - synthetic data and filtered publicly available websites - with a focus on very high-quality, reasoning-dense data. The model belongs to the Phi-3 model family and supports a 128K token context length. It underwent a rigorous enhancement process, incorporating supervised fine-tuning, proximal policy optimization, and direct preference optimization to ensure precise instruction adherence and robust safety measures.

## Usage

### Requirements
The Phi-3 model family has been integrated into `transformers` as of version `4.43.0`. The currently installed `transformers` version can be verified with `pip list | grep transformers`.

Examples of required packages:
```
flash_attn==2.5.8
torch==2.3.1
accelerate==0.31.0
transformers==4.43.0
```
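
The same version check can also be done from Python; this simply restates the shell command above:

```python
import transformers

# Phi-3.5 support requires transformers >= 4.43.0.
print(transformers.__version__)
```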

Phi-3.5-mini-instruct is also available in [Azure AI Studio](https://aka.ms/try-phi3.5mini).

### Tokenizer

Phi-3.5-mini-Instruct supports a vocabulary size of up to `32064` tokens. The [tokenizer files](https://huggingface.co/microsoft/Phi-3.5-mini-instruct/blob/main/added_tokens.json) already provide placeholder tokens that can be used for downstream fine-tuning, but they can also be extended up to the model's vocabulary size.
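
To make that concrete, here is a minimal sketch (not from the original card) of registering an extra token and resizing the embedding matrix to match; the token name `<my_special_token>` is a hypothetical example:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3.5-mini-instruct")
model = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3.5-mini-instruct")

# "<my_special_token>" is a hypothetical example token.
tokenizer.add_tokens(["<my_special_token>"])

# Keep the embedding matrix in sync with the enlarged vocabulary.
model.resize_token_embeddings(len(tokenizer))
```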

### Input Formats
Given the nature of the training data, the Phi-3.5-mini-instruct model is best suited for prompts using the chat format as follows:

```
<|system|>
You are a helpful assistant.<|end|>
<|user|>
How to explain Internet for a medieval knight?<|end|>
<|assistant|>
```
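
This layout does not need to be assembled by hand: the tokenizer ships a chat template that renders it from a list of messages. A minimal sketch:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3.5-mini-instruct")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How to explain Internet for a medieval knight?"},
]

# Renders the <|system|>/<|user|>/<|assistant|> format shown above;
# add_generation_prompt=True appends the trailing <|assistant|> tag.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```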

### Loading the model locally
After obtaining the Phi-3.5-mini-instruct model checkpoint, users can run inference with this sample code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

torch.random.manual_seed(0)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3.5-mini-instruct",
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3.5-mini-instruct")

messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Can you provide ways to eat combinations of bananas and dragonfruits?"},
    {"role": "assistant", "content": "Sure! Here are some ways to eat bananas and dragonfruits together: 1. Banana and dragonfruit smoothie: Blend bananas and dragonfruits together with some milk and honey. 2. Banana and dragonfruit salad: Mix sliced bananas and dragonfruits together with some lemon juice and honey."},
    {"role": "user", "content": "What about solving an 2x + 3 = 7 equation?"},
]

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

# Greedy decoding: do_sample=False makes the output deterministic.
generation_args = {
    "max_new_tokens": 500,
    "return_full_text": False,
    "temperature": 0.0,
    "do_sample": False,
}

output = pipe(messages, **generation_args)
print(output[0]['generated_text'])
```
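
For interactive use, the reply can also be streamed token by token instead of printed at the end. This variant is a sketch (not part of the original card) that reuses the `model`, `tokenizer`, and `messages` objects defined above:

```python
from transformers import TextStreamer

# Render the chat with the model's template and move it to the model's device.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# skip_prompt=True prints only the newly generated tokens.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(input_ids, max_new_tokens=500, streamer=streamer)
```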