ldilov committed on
Commit
f5431f7
1 Parent(s): 02aa2f6

Update README.md

Files changed (1)
  1. README.md +52 -3
README.md CHANGED
@@ -1,3 +1,52 @@
1
- ---
2
- license: openrail
3
- ---
1
+ ## Model Card: stablelm-tuned-alpha-7b-4bit-128g
2
+
3
+ ### Description
4
+
5
+ stablelm-tuned-alpha-7b-4bit-128g is a 4-bit quantized version of the stablelm-tuned-alpha-7b language model. It is based on the GPTNeoX architecture and was quantized with the AutoGPTQ framework. The base model was fine-tuned to generate conversational responses.
6
+
7
+ Quantization reduces the memory footprint and can improve inference efficiency, at some cost in precision. The model uses 4-bit quantization with a group size of 128, meaning each group of 128 weights shares its own quantization parameters. The dampening factor (damp_percent) is set to 0.01; it adds a small fraction of the average Hessian diagonal back onto the diagonal, which keeps the quantization procedure numerically stable.
8
+
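To make "4-bit quantization with a group size of 128" concrete, here is a minimal sketch of group-wise symmetric quantization in plain Python. It uses simple round-to-nearest, not the error-compensating GPTQ procedure that AutoGPTQ applies, and it stores signed codes for readability. The function names are hypothetical and exist only for this illustration.

```python
from typing import List, Tuple

Group = Tuple[List[int], float]  # (integer codes, shared scale)


def quantize_group(weights: List[float], bits: int = 4) -> Group:
    """Symmetric round-to-nearest quantization of one group of weights.

    Simplification: this is NOT GPTQ; it only shows what sharing one
    scale across a group means.
    """
    qmax = 2 ** (bits - 1) - 1                # 7 for 4-bit signed codes
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Clamp to the representable signed range [-qmax - 1, qmax].
    codes = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def quantize_row(row: List[float], bits: int = 4, group_size: int = 128) -> List[Group]:
    """Split a weight row into consecutive groups and quantize each one."""
    return [
        quantize_group(row[start:start + group_size], bits)
        for start in range(0, len(row), group_size)
    ]


def dequantize_row(groups: List[Group]) -> List[float]:
    """Reconstruct approximate weights from codes and per-group scales."""
    return [code * scale for codes, scale in groups for code in codes]


if __name__ == "__main__":
    row = [(i - 128) / 100.0 for i in range(256)]   # toy weights, two groups
    groups = quantize_row(row, bits=4, group_size=128)
    restored = dequantize_row(groups)
    worst = max(abs(a - b) for a, b in zip(row, restored))
    print(f"groups: {len(groups)}, worst absolute error: {worst:.4f}")
```

A smaller group size gives each scale fewer weights to cover, which usually lowers the reconstruction error but stores more scales.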
9
+ ### Model Details
10
+
11
+ - Model Name: stablelm-tuned-alpha-7b-4bit-128g
12
+ - Base Model: stablelm-tuned-alpha-7b
13
+ - Quantization Configuration:
14
+   - Bits: 4
15
+   - Group Size: 128
16
+   - Damp Percent: 0.01
17
+   - Descending Activation Quantization (desc_act): Enabled
18
+   - Symmetric Quantization (sym): Enabled
19
+   - True Sequential Quantization (true_sequential): Enabled
20
+
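The settings above can be written down as a plain dictionary, which also makes it easy to estimate the storage cost per weight. This is a rough sketch under stated assumptions: the keys `bits` and `group_size` are assumed spellings for the first two items (the other keys are the names given in the list), and the overhead model assumes one 16-bit scale plus one zero-point of `bits` width per group, which may not match the exact on-disk layout.

```python
# Settings as listed in the model card. "bits" and "group_size" are assumed
# key spellings; the remaining keys are the names given in the card.
QUANT_CONFIG = {
    "bits": 4,
    "group_size": 128,
    "damp_percent": 0.01,
    "desc_act": True,
    "sym": True,
    "true_sequential": True,
}


def effective_bits_per_weight(bits: int, group_size: int, scale_bits: int = 16) -> float:
    """Estimate storage per weight, including per-group metadata.

    Assumption: each group stores one scale (scale_bits wide) and one
    zero-point (bits wide) in addition to its quantized weights.
    """
    return bits + (scale_bits + bits) / group_size


def approx_weight_bytes(n_params: int, bits_per_weight: float) -> float:
    """Approximate size in bytes of n_params quantized weights."""
    return n_params * bits_per_weight / 8


if __name__ == "__main__":
    bpw = effective_bits_per_weight(QUANT_CONFIG["bits"], QUANT_CONFIG["group_size"])
    print(f"effective bits per weight: {bpw}")
    # Hypothetical parameter count, chosen only to show the arithmetic.
    print(f"bytes for 1,000,000 weights: {approx_weight_bytes(1_000_000, bpw)}")
```

Under these assumptions the per-group metadata adds 20/128 bits, so each weight costs about 4.16 bits rather than exactly 4. Layers left unquantized, such as embeddings, add to the total file size.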
21
+ ### Usage
22
+
23
+ The stablelm-tuned-alpha-7b-4bit-128g model can be used for conversational tasks such as chatbots, question-answering systems, and dialogue generation. It generates responses from a given system prompt, context, and input text.
24
+
25
+ To use the model, provide a system prompt, context, and input text in the following format:
26
+
27
+ Input: {system_prompt}\n{context}: {text}
28
+ Label: {response}
29
+
30
+ Tokenize the inputs with the base model's tokenizer before passing them to the model, and follow the base model's official template for the system prompt and user prompt.
31
+
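The input format above can be assembled with a small helper. This is a minimal sketch: `build_input` is a hypothetical name, and it assumes that `{context}` is a short label placed before the text, such as a speaker name.

```python
def build_input(system_prompt: str, context: str, text: str) -> str:
    """Assemble a model input in the form '{system_prompt}\\n{context}: {text}'.

    Assumption: `context` is a short label (for example a speaker name)
    that precedes the user text.
    """
    return f"{system_prompt}\n{context}: {text}"


if __name__ == "__main__":
    prompt = build_input(
        system_prompt="You are a helpful assistant.",
        context="User",
        text="What does 4-bit quantization mean?",
    )
    print(prompt)
```

The resulting string would then be tokenized with the base model's tokenizer and passed to the model, which produces the `{response}` part.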
32
+ ### Performance
33
+
34
+ - Model Size: 5 GB
35
+ - Inference Speed: N/A
36
+ - Accuracy: N/A
37
+
38
+ ### Limitations and Considerations
39
+
40
+ - As a language model, stablelm-tuned-alpha-7b-4bit-128g depends on the quality and relevance of its training data. Its responses may be contextually appropriate yet factually inaccurate or unsuitable for some scenarios.
41
+ - Quantization trades precision for smaller model size and lower memory use. The quality of generated responses may be slightly lower than with the original model.
42
+ - The model may not have been trained on domain-specific data and may perform poorly on specialized tasks.
43
+
44
+ ### Acknowledgments
45
+
46
+ The stablelm-tuned-alpha-7b-4bit-128g model is derived from the stablelm-tuned-alpha-7b model developed by StabilityAI, which uses the GPTNeoX architecture, and was quantized with the AutoGPTQ framework. It builds upon the research and contributions of the open-source community in language modeling and conversational AI.
47
+
48
+ ### License
49
+
50
+ The stablelm-tuned-alpha-7b-4bit-128g model is released under the [CC BY-NC-SA 4.0 license terms](https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en_GB) specified by StabilityAI.
51
+ Quantized by Lazar Dilov ([GitHub](https://github.com/ldilov/IntelliBridge)).
52
+ Quantized with the AutoGPTQ framework, created by [PanQiWei](https://github.com/PanQiWei/).