Create README.md
README.md
ADDED
@@ -0,0 +1,39 @@
1 |
+
---
|
2 |
+
license: apache-2.0
|
3 |
+
---
|
4 |
+
This is the OpenNMT-py converted version of Mistral 7B Instruct v0.2, quantized to 4-bit with AWQ (GEMM version, which is faster for large batch sizes).
|
5 |
+
|
6 |
+
The safetensors file is 4.2 GB, so the model runs smoothly on any RTX card.
|
7 |
+
|
8 |
+
The command line to run it is:
|
9 |
+
```
|
10 |
+
python onmt/bin/translate.py --config /pathto/mistral-instruct-inference-awq.yaml --src /pathto/input-vicuna.txt --output /pathto/mistral-output.txt
|
11 |
+
```
|
12 |
+
where, for instance, `input-vicuna.txt` contains:
|
13 |
+
|
14 |
+
USER:⦅newline⦆Show me some attractions in Boston.⦅newline⦆⦅newline⦆ASSISTANT:⦅newline⦆
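
Each prompt in the source file occupies a single line, with line breaks written as the `⦅newline⦆` placeholder. A minimal sketch of how such a line could be produced from a plain user message follows; the helper names `to_onmt_line` and `write_prompts` are hypothetical and not part of OpenNMT-py, and the template is simply copied from the example above.

```python
NEWLINE = "⦅newline⦆"  # placeholder token, as it appears in the example above


def to_onmt_line(user_message: str) -> str:
    """Flatten one user message into a single-line prompt (hypothetical helper)."""
    # Template copied from the example input: USER / blank line / ASSISTANT.
    prompt = f"USER:\n{user_message.strip()}\n\nASSISTANT:\n"
    # Normalize Windows line endings, then replace every real newline.
    return prompt.replace("\r\n", "\n").replace("\n", NEWLINE)


def write_prompts(messages, path):
    """Write one flattened prompt per line to `path` (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        for message in messages:
            f.write(to_onmt_line(message) + "\n")


line = to_onmt_line("Show me some attractions in Boston.")
```

Calling `write_prompts([...], "/pathto/input-vicuna.txt")` with a list of messages would then give one prompt per line, which is the layout the `--src` file in the command above uses.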
|
15 |
+
|
16 |
+
The output will be:
|
17 |
+
|
18 |
+
```
|
19 |
+
Absolutely, Boston is rich in history and culture. Here are some must-visit attractions in Boston:⦅newline⦆⦅newline⦆1. Freedom Trail: This 2.5-mile-long path passes through 16 historical sites, including the Paul Revere House, the Old North Church, and the USS Constitution.⦅newline⦆⦅newline⦆2. Boston Common: Established in 1634, Boston Common is the oldest city park in the United States. It covers an area of 50 acres and is home to several monuments, including the Emancipation Monument, the Robert Gould Shaw and the 54th Massachusetts Regiment Memorial, and the Massachusetts Soldiers and Sailors Monument.⦅newline⦆⦅newline⦆3. New England Aquarium: Located on the Central Wharf in the Fort Point Channel, the New England Aquarium is one of the premier visitor attractions in Boston. It covers an area of 23 acres and is home to over 20,000 animals, representing more than 1,200 species. The aquarium is divided into several galleries, including the Giant Ocean Tank, the Caribbean Coral Reef Gallery, the Amazon Rainforest Exhibit, the Sh
```
|
20 |
+
|
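
The output file uses the same `⦅newline⦆` placeholder as the input, with one answer per line. A small sketch of turning such a line back into readable text is shown here; `restore_newlines` is a hypothetical helper, and the sample string is a shortened illustration rather than real model output.

```python
NEWLINE = "⦅newline⦆"  # same placeholder as in the input file


def restore_newlines(line: str) -> str:
    """Convert one output line back to multi-line text (hypothetical helper)."""
    # Drop the line terminator first, then expand the placeholders.
    return line.rstrip("\n").replace(NEWLINE, "\n")


# Shortened, illustrative sample; not actual model output.
sample = "Here are some attractions:⦅newline⦆⦅newline⦆1. Freedom Trail\n"
text = restore_newlines(sample)
```

Applying this to each line of `/pathto/mistral-output.txt` would give one readable answer per input prompt.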
21 |
+
|
22 |
+
With a batch size of 60, you can get good throughput (about 2,494 tokens per second in the run below):
|
23 |
+
|
24 |
+
```
|
25 |
+
[2023-12-20 13:17:58,837 INFO] Loading checkpoint from /mnt/InternalCrucial4/dataAI/mistral-7B/mistral-instruct-v0.2/Mistral-7B-instruct-onmt-awq-gemm.pt
|
26 |
+
[2023-12-20 13:17:58,938 INFO] awq_gemm compression of layer ['w_1', 'w_2', 'w_3', 'linear_values', 'linear_query', 'linear_keys', 'final_linear']
|
27 |
+
[2023-12-20 13:18:02,923 INFO] Loading data into the model
|
28 |
+
step0 time: 1.271669864654541
|
29 |
+
[2023-12-20 13:18:09,028 INFO] PRED SCORE: -0.2166, PRED PPL: 1.24 NB SENTENCES: 59
|
30 |
+
[2023-12-20 13:18:09,028 INFO] Total translation time (s): 5.1
|
31 |
+
[2023-12-20 13:18:09,028 INFO] Average translation time (ms): 87.1
|
32 |
+
[2023-12-20 13:18:09,028 INFO] Tokens per second: 2494.1
|
33 |
+
Time w/o python interpreter load/terminate: 10.200786590576172
|
34 |
+
|
35 |
+
```
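
The figures in this log can be cross-checked with a little arithmetic. The sketch below parses the four summary lines and derives an approximate token count and per-sentence cost. The logged values are rounded, so the results are estimates, and reading "Average translation time" as a per-sentence figure is an assumption that the numbers appear consistent with.

```python
import re

# Summary lines copied from the log above.
LOG = """\
[2023-12-20 13:18:09,028 INFO] PRED SCORE: -0.2166, PRED PPL: 1.24 NB SENTENCES: 59
[2023-12-20 13:18:09,028 INFO] Total translation time (s): 5.1
[2023-12-20 13:18:09,028 INFO] Average translation time (ms): 87.1
[2023-12-20 13:18:09,028 INFO] Tokens per second: 2494.1
"""


def grab(pattern: str, text: str) -> float:
    """Return the first captured number for `pattern`, or raise if absent."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"pattern not found: {pattern}")
    return float(match.group(1))


sentences = grab(r"NB SENTENCES: (\d+)", LOG)
total_s = grab(r"Total translation time \(s\): ([\d.]+)", LOG)
tok_per_s = grab(r"Tokens per second: ([\d.]+)", LOG)

total_tokens = tok_per_s * total_s              # roughly 12,720 tokens generated
tokens_per_sentence = total_tokens / sentences  # roughly 216 tokens per answer
ms_per_sentence = 1000 * total_s / sentences    # roughly 86 ms, close to the logged 87.1

print(f"{total_tokens:.0f} tokens, {tokens_per_sentence:.1f} per sentence, "
      f"{ms_per_sentence:.1f} ms per sentence")
```

The small gap between the derived 86 ms and the logged 87.1 ms is what one would expect from the total time being rounded to one decimal place.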