isemmanuelolowe committed on
Commit
92bfd68
1 Parent(s): 989f2d3

Update README.md

Files changed (1):
  1. README.md +43 -0
README.md CHANGED
@@ -1,3 +1,46 @@
  ---
  license: mit
  ---
+
+ # Jamba 4xMoE (SLERP Merge)
+
+ This model was merged from [Jamba](https://huggingface.co/ai21labs/Jamba-v0.1), a 52B-parameter model with 16 experts, using an accumulative SLERP to reduce the number of experts from 16 to 4.
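The merge script itself is not published here, so the following is only a rough sketch of how an accumulative SLERP from 16 experts down to 4 could work. The `slerp` helper, the grouping of experts into fours, and the `t = 1/(k+1)` weighting schedule are illustrative assumptions, not the author's actual procedure:

```python
import torch

def slerp(t, w0, w1, eps=1e-8):
    # Spherical linear interpolation between two weight tensors of the same shape.
    v0, v1 = w0.flatten().float(), w1.flatten().float()
    dot = torch.clamp(
        torch.dot(v0 / (v0.norm() + eps), v1 / (v1.norm() + eps)), -1.0, 1.0
    )
    theta = torch.acos(dot)
    if theta.abs() < eps:  # nearly parallel: plain lerp avoids dividing by sin(theta) ~ 0
        return (1 - t) * w0 + t * w1
    a = torch.sin((1 - t) * theta) / torch.sin(theta)
    b = torch.sin(t * theta) / torch.sin(theta)
    return (a * v0 + b * v1).reshape(w0.shape).to(w0.dtype)

def accumulative_merge(experts, group_size=4):
    # Fold each group of `group_size` expert tensors into one by repeatedly
    # slerping the running merge with the next expert; t = 1/(k+1) keeps each
    # expert's contribution roughly equal.
    merged = []
    for i in range(0, len(experts), group_size):
        acc = experts[i]
        for k, w in enumerate(experts[i + 1 : i + group_size], start=1):
            acc = slerp(1.0 / (k + 1), acc, w)
        merged.append(acc)
    return merged
```

At `t = 0` or `t = 1` the interpolation returns the corresponding endpoint exactly, so merging a group of identical experts is a no-op.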
+
+
+ ## 4-bit Inference Code
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+ import torch
+
+ model_id = "isemmanuelolowe/Jamba-4xMoE_slerp"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ quantization_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     # load_in_8bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.bfloat16,
+     bnb_4bit_use_double_quant=True,
+     llm_int8_skip_modules=["mamba"],  # keep the Mamba layers unquantized
+ )
+
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     trust_remote_code=True,
+     torch_dtype=torch.bfloat16,
+     attn_implementation="flash_attention_2",
+     quantization_config=quantization_config,
+ )
+
+ input_ids = tokenizer("Hi, how are you?", return_tensors="pt")["input_ids"].to("cuda")
+
+ # Greedy decoding (do_sample=False) rather than temperature=0, which generate() ignores
+ out = model.generate(input_ids, max_new_tokens=256, do_sample=False, repetition_penalty=1.2)
+ print(tokenizer.batch_decode(out, skip_special_tokens=True))
+ ```
+
+ OUTPUT:
+ Here is how to do bubble sort
+ ```
+ ["Hi, how are you?\n\nHello. I am a 20-year old female and in my prime of life. And the other day I was told that I have been on this site for over than three years now. That is why I can be here to help others with their issues or concerns about themselves as well. It's not just me who has done it all these days without any reason whatsoever! So least say something good about yourself too: Because there exists no point at which anyone would want anything else from us except our own self esteem being restored again soon enough so we could get back into things properly once more before starting up another new chapter somewhere far away where nobody knows what happens next after each passing second until finally coming full force against reality itself whence already having taken place long ago but only because one person had gone ahead first thing along making sure everything went according due diligence beforehand rather then letting someone else do his/her job instead later down line if he didn't know much better himself yet still doing nothing right way around anyway since always trying hard enough even though never actually knowing exactly whether whomsoever did indeed come across him during course workday routine checkups every single hour throughout weeklong duration period wise enough considering carefully enough times between both sides equally balanced consideration given proper"]
+ ```
+
+ A chat LoRA adapter is available for this model [here](https://huggingface.co/isemmanuelolowe/jamba_chat_4MoE_8k).
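Assuming the adapter repository is in standard PEFT format (not confirmed by this README), attaching it to the already-loaded base model might look like this untested sketch:

```python
def load_chat_adapter(base_model, adapter_id="isemmanuelolowe/jamba_chat_4MoE_8k"):
    # Wrap the quantized base model with the chat LoRA weights.
    # Requires `pip install peft`; imported lazily so the inference
    # code above still runs without it.
    from peft import PeftModel
    return PeftModel.from_pretrained(base_model, adapter_id)

# Usage (after loading the base model as shown above):
# model = load_chat_adapter(model)
```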