lxyuan committed on
Commit
69a55c5
1 Parent(s): 776ece8

Update README.md

Files changed (1)
  1. README.md +65 -1
README.md CHANGED
@@ -86,4 +86,68 @@ print(outputs[0]["generated_text"])
  - He played a significant role in Singapore's rapid development, transforming the country from a poor and undeveloped nation into a modern and prosperous city-state.
  - Lee passed away in 2015, at the age of 91.
  - He was widely regarded as one of the most influential leaders of the 20th century and a key figure in the history of Singapore.
- ```
+ ```
+
+ ### 4-bit Inference Example
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
+ import transformers
+ import torch
+
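+ # 4-bit loading requires the bitsandbytes and accelerate packages alongside transformers
+ # (e.g. `pip install bitsandbytes accelerate`).
+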
+ #!nvidia-smi
+
+ """
+ Wed Feb 7 12:51:07 2024
+ +---------------------------------------------------------------------------------------+
+ | NVIDIA-SMI 535.154.05             Driver Version: 535.154.05   CUDA Version: 12.2     |
+ |-----------------------------------------+----------------------+----------------------+
+ | GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
+ | Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
+ |                                         |                      |               MIG M. |
+ |=========================================+======================+======================|
+ |   0  Tesla V100-SXM2-16GB           On  | 00000000:00:1E.0 Off |                    0 |
+ | N/A   41C    P0              44W / 300W |   4950MiB / 16384MiB |      0%      Default |
+ |                                         |                      |                  N/A |
+ +-----------------------------------------+----------------------+----------------------+
+ """
+
+ model_id = "lxyuan/AeolusBlend-7B-slerp"
+
+ # 4-bit NF4 quantization with nested (double) quantization; compute runs in bfloat16
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_use_double_quant=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.bfloat16
+ )
+
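+ # NF4 ("4-bit NormalFloat") packs each weight into 4 bits, and double quantization also
+ # compresses the quantization constants (roughly 0.4 bits per parameter saved), so a 7B
+ # model's weights occupy about 3.5-4 GB, in line with the nvidia-smi reading above.
+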
+ # Load the quantized model and tokenizer; device_map="auto" places layers on available devices
+ model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+
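+ # Optional sanity check: Transformers models expose get_memory_footprint(), which
+ # should report roughly 4 GB for this 4-bit load.
+ # print(f"{model.get_memory_footprint() / 1024**3:.2f} GiB")
+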
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=model,
+     tokenizer=tokenizer,
+     device_map="auto",
+ )
+
+ # Build a prompt with the model's chat template, then generate
+ messages = [{"role": "user", "content": "What is a large language model?"}]
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+
+ outputs = pipeline(prompt, max_new_tokens=2048, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
+
+ print(outputs[0]["generated_text"])
+
+ >>>
+ <s>[INST] What is a large language model? [/INST]
+
+ A large language model is a type of artificial intelligence system that has been trained on vast amounts of
+ text data, enabling it to generate human-like responses to a wide range of written prompts. These models are
+ designed to learn the patterns and rules of language, and as a result, they can perform various natural
+ language processing tasks, such as translation, summarization, and question-answering, with a high degree
+ of accuracy. Large language models are typically powered by deep learning algorithms and can have billions
+ or trillions of parameters, making them capable of processing and understanding complex language structures
+ and nuances. Some well-known examples of large language models include GPT-3, BERT, and T5.
+ ```
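+
+ The chat template also supports multi-turn use: append the assistant's reply to the
+ message list and re-apply the template. A minimal sketch, assuming the `pipeline` and
+ `tokenizer` objects from the example above:
+
+ ```python
+ messages = [{"role": "user", "content": "What is a large language model?"}]
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+
+ # return_full_text=False keeps only the newly generated tokens, not the echoed prompt
+ reply = pipeline(prompt, max_new_tokens=512, do_sample=True, temperature=0.7,
+                  return_full_text=False)[0]["generated_text"]
+
+ messages.append({"role": "assistant", "content": reply.strip()})
+ messages.append({"role": "user", "content": "Name three examples of such models."})
+
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ followup = pipeline(prompt, max_new_tokens=512, do_sample=True, temperature=0.7,
+                     return_full_text=False)[0]["generated_text"]
+ print(followup.strip())
+ ```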
+
+ - The 4-bit inference example notebook can be found [here](https://github.com/LxYuan0420/nlp/blob/main/notebooks/Inference_4bit_AeolusBlend.ipynb).