harshithsaiv committed · Commit 8ee0ee4 · verified · 1 Parent(s): 38e5430

Update README.md

  1. README.md +25 -1
README.md CHANGED
@@ -1,3 +1,27 @@
+ ---
+ license: mit
+ datasets:
+ - Salesforce/wikitext
+ language:
+ - en
+ metrics:
+ - perplexity
+ base_model:
+ - mistralai/Mistral-7B-Instruct-v0.3
+ - meta-llama/Meta-Llama-3-8B-Instruct
+ tags:
+ - quantization
+ - kv-cache
+ - llm-inference
+ - cuda
+ - triton
+ - memory-efficient
+ - mitral
+ - llama
+ - inference-optimization
+ - 4-bit
+ - mixed-precision
+ ---
  # Per-Head Mixed-Precision KV Cache Compression
 
  Calibrate once. Pack truly. Same quality.
@@ -199,4 +223,4 @@ Step 3 — Results
 
  MIT. Free to use, modify, and distribute.
 
- Built in one week on an A100 SXM4 40GB. Questions, issues, and PRs welcome.
+ Built in one week on an A100 SXM4 40GB. Questions, issues, and PRs welcome.