Commit eb3409f (verified) by sharpenb
1 Parent(s): 958a775

d9a7797220f58c53eabc7e2d39f3ac00e3b49b98d8e70a9c69ae27e1b937e230

Files changed (3)
  1. README.md +5 -5
  2. config.json +1 -1
  3. smash_config.json +1 -1
README.md CHANGED
@@ -1,6 +1,6 @@
  ---
  thumbnail: "https://assets-global.website-files.com/646b351987a8d8ce158d1940/64ec9e96b4334c0e1ac41504_Logo%20with%20white%20text.svg"
- base_model: ORIGINAL_REPO_NAME
+ base_model: neeleshg23/jamba-1.9b-7
  metrics:
  - memory_disk
  - memory_inference
@@ -52,7 +52,7 @@ tags:
 
  You can run the smashed model with these steps:
 
- 0. Check requirements from the original repo ORIGINAL_REPO_NAME installed. In particular, check python, cuda, and transformers versions.
+ 0. Check requirements from the original repo neeleshg23/jamba-1.9b-7 installed. In particular, check python, cuda, and transformers versions.
  1. Make sure that you have installed quantization related packages.
  ```bash
  pip install transformers accelerate bitsandbytes>0.37.0
@@ -63,7 +63,7 @@ You can run the smashed model with these steps:
 
 
  model = AutoModelForCausalLM.from_pretrained("PrunaAI/neeleshg23-jamba-1.9b-7-bnb-8bit-smashed", trust_remote_code=True, device_map='auto')
- tokenizer = AutoTokenizer.from_pretrained("ORIGINAL_REPO_NAME")
+ tokenizer = AutoTokenizer.from_pretrained("neeleshg23/jamba-1.9b-7")
 
  input_ids = tokenizer("What is the color of prunes?,", return_tensors='pt').to(model.device)["input_ids"]
 
@@ -77,9 +77,9 @@ The configuration info are in `smash_config.json`.
 
  ## Credits & License
 
- The license of the smashed model follows the license of the original model. Please check the license of the original model ORIGINAL_REPO_NAME before using this model which provided the base model. The license of the `pruna-engine` is [here](https://pypi.org/project/pruna-engine/) on Pypi.
+ The license of the smashed model follows the license of the original model. Please check the license of the original model neeleshg23/jamba-1.9b-7 before using this model which provided the base model. The license of the `pruna-engine` is [here](https://pypi.org/project/pruna-engine/) on Pypi.
 
  ## Want to compress other models?
 
  - Contact us and tell us which model to compress next [here](https://www.pruna.ai/contact).
- - Request access to easily compress your own AI models [here](https://z0halsaff74.typeform.com/pruna-access?typeform-source=www.pruna.ai).
+ - Do it by yourself [here](https://docs.pruna.ai/en/latest/setup/pip.html).
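Most of the README changes in this commit replace the template placeholder `ORIGINAL_REPO_NAME` with the base-model id `neeleshg23/jamba-1.9b-7`. A minimal sketch of how one might check that no placeholder is left behind after such a substitution; the helper name `find_placeholders` and the two-line sample text are hypothetical and not part of the commit or of any Pruna tooling.

```python
def find_placeholders(text: str, placeholder: str = "ORIGINAL_REPO_NAME") -> list[int]:
    """Return the 1-based numbers of the lines in `text` that still contain `placeholder`."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if placeholder in line
    ]


# Two lines taken from the old side of the README diff (hypothetical sample file).
before = (
    "base_model: ORIGINAL_REPO_NAME\n"
    'tokenizer = AutoTokenizer.from_pretrained("ORIGINAL_REPO_NAME")\n'
)
# The substitution the commit performs, applied to the sample.
after = before.replace("ORIGINAL_REPO_NAME", "neeleshg23/jamba-1.9b-7")

print(find_placeholders(before))  # lines that still need fixing
print(find_placeholders(after))   # empty list once every placeholder is replaced
```

Running a check like this over a generated model card before publishing would catch the kind of leftover template text that this commit fixes by hand.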
config.json CHANGED
@@ -1,5 +1,5 @@
  {
- "_name_or_path": "/covalent/.cache/models/tmp0n_hx_3e9y8ksise",
+ "_name_or_path": "/covalent/.cache/models/tmp_52zyzai_lzq9dm2",
  "architectures": [
  "JambaForCausalLM"
  ],
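In this commit, the `_name_or_path` value in `config.json` begins with the `cache_dir` value that the same commit records in `smash_config.json`, both before and after the change. The sketch below checks that relationship using the values from the diff. The prefix relation is only an observation from this commit, not a documented guarantee, and `paths_consistent` is a hypothetical helper.

```python
import json


def paths_consistent(config_json: str, smash_config_json: str) -> bool:
    """Return True if config.json's `_name_or_path` starts with smash_config.json's `cache_dir`."""
    config = json.loads(config_json)
    smash_config = json.loads(smash_config_json)
    return config["_name_or_path"].startswith(smash_config["cache_dir"])


# Values copied from the new and old sides of the diffs in this commit.
new_config = '{"_name_or_path": "/covalent/.cache/models/tmp_52zyzai_lzq9dm2"}'
new_smash = '{"cache_dir": "/covalent/.cache/models/tmp_52zyzai"}'
old_smash = '{"cache_dir": "/covalent/.cache/models/tmp0n_hx_3e"}'

print(paths_consistent(new_config, new_smash))  # both files from the same export
print(paths_consistent(new_config, old_smash))  # mismatched pair
```

Both paths point at temporary directories on the machine that produced the export, so a mismatch would only suggest that the two files came from different runs; it says nothing about whether the model loads.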
smash_config.json CHANGED
@@ -28,7 +28,7 @@
  "quant_llm-int8_weight_bits": 8,
  "max_batch_size": 1,
  "device": "cuda",
- "cache_dir": "/covalent/.cache/models/tmp0n_hx_3e",
+ "cache_dir": "/covalent/.cache/models/tmp_52zyzai",
  "task": "",
  "save_load_fn": "bitsandbytes",
  "save_load_fn_args": {}