---
license: mit
library_name: adapter-transformers
---
Effi-13B-AWQ is an AWQ-quantized version of [Effi-13B](https://huggingface.co/aiplanet/effi-13b), our reasoning model.

## About AWQ

AWQ is an efficient, accurate, and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization. Compared to GPTQ, it offers faster Transformers-based inference.
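
As a minimal sketch of loading this checkpoint with the AutoAWQ library (assuming this repo follows the standard AWQ checkpoint layout; adjust the model id and prompt to your needs):

```python
# Minimal sketch, assuming the AutoAWQ library (pip install autoawq)
# and a standard AWQ checkpoint layout in this repo.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_id = "aiplanet/effi-13B-AWQ"

# Load the 4-bit AWQ weights; fuse_layers speeds up inference on supported GPUs.
model = AutoAWQForCausalLM.from_quantized(model_id, fuse_layers=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = "Explain step by step why the sky is blue."
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.cuda()
output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```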

It is also supported by vLLM, a continuous-batching inference server, which allows AWQ models to be used for high-throughput concurrent inference in multi-user server scenarios.
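
A minimal vLLM serving sketch might look like the following; the sampling settings and prompt are illustrative:

```python
# Minimal sketch, assuming a vLLM build with AWQ support (pip install vllm).
from vllm import LLM, SamplingParams

# quantization="awq" tells vLLM to load the 4-bit AWQ weights.
llm = LLM(model="aiplanet/effi-13B-AWQ", quantization="awq", dtype="half")

params = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=256)
outputs = llm.generate(["Explain step by step: what causes tides?"], params)
print(outputs[0].outputs[0].text)
```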

Effi-13B is a 13-billion-parameter causal decoder-only model built by AI Planet. It is based on Llama-2-13b-chat-hf and fine-tuned on 1.8 million conversations from a Chain-of-Thought (CoT) dataset available on Hugging Face Datasets. The model is made available under the Apache 2.0 license.

## Why use effi-13B-Instruct?

- This is a ready-to-use chat/instruct model based on Llama-2-13b-chat-hf that provides a rationale for the context provided.
- Llama-2 is among the strongest open-source models available. Because this is an instruct model, it may not be ideal for further fine-tuning. If you are interested in building your own instruct/chat model, we recommend starting from Llama-2-13b-chat-hf.

You will need at least 85-100 GB of memory to run inference with effi-13b swiftly.

## Our benchmarking

| Metric                | Value |
|-----------------------|-------|
| Perplexity            | 5.529 |
| MMLU                  | 50.90 |
| HellaSwag (acc)       | 59.38 |
| HellaSwag (acc_norm)  | 78.91 |
| TruthfulQA            | 38.24 |

## Direct Use

Effi-13b has been fine-tuned on a Chain-of-Thought dataset.
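
As an illustrative sketch of direct use, the snippet below loads the model through a Transformers `text-generation` pipeline; this assumes a recent `transformers` with `autoawq` installed so the AWQ checkpoint loads natively, and the prompt wording is only an example, since the card does not document an exact prompt template:

```python
# Minimal sketch, assuming transformers>=4.35 with autoawq installed
# so the AWQ checkpoint loads natively. The prompt wording is illustrative.
from transformers import pipeline

pipe = pipeline("text-generation", model="aiplanet/effi-13B-AWQ", device_map="auto")

prompt = (
    "Question: A train travels 120 km in 2 hours. What is its average speed?\n"
    "Answer with step-by-step reasoning:"
)
result = pipe(prompt, max_new_tokens=200, do_sample=True, temperature=0.7)
print(result[0]["generated_text"])
```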

## Out-of-Scope Use

Production use without adequate assessment of risks and mitigation; any use cases which may be considered irresponsible or harmful.

## Bias, Risks, and Limitations

This model has been trained primarily on English data and will not generalize appropriately to other languages. Furthermore, as it is trained on large-scale corpora representative of the web, it will carry the stereotypes and biases commonly encountered online.

## Recommendations

We recommend that users of effi-13b develop guardrails and take appropriate precautions for any production use.

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## Citations

```bibtex
@misc{lucifertrj,
  author    = {Tarun Jain},
  title     = {Effi-13B-AWQ by AI Planet},
  year      = 2024,
  url       = {https://huggingface.co/aiplanet/effi-13B-AWQ/},
  publisher = {Hugging Face}
}
```