FaisalFehad committed
Commit 51cce46 · verified · 1 Parent(s): 72dd8a4

Upload README.md with huggingface_hub

  1. README.md +18 -2
README.md CHANGED
```diff
@@ -13,9 +13,11 @@ tags:
  - quantized
  - apple-silicon
  - text-generation
+ - conversational
  - agentic
  - retrieval
  - search
+ - tool-calling
  - lm-studio
  ---

@@ -23,6 +25,9 @@ tags:

  MLX quantization of [chromadb/context-1](https://huggingface.co/chromadb/context-1) for Apple Silicon.

+ - Converted with [mlx-lm](https://github.com/ml-explore/mlx-lm) version 0.31.2
+ - Also available: [context-1-MLX-6bit](https://huggingface.co/mlx-community/context-1-MLX-6bit)
+
  ## Key Specs

  | Detail | Value |
@@ -35,6 +40,9 @@ MLX quantization of [chromadb/context-1](https://huggingface.co/chromadb/context
  | Attention | Alternating sliding window (128 tokens) + full attention |
  | Quantization | 4-bit affine, group size 64 |
  | Original Precision | BF16 |
+ | Disk Size | ~11 GB |
+ | Peak Memory | ~12 GB |
+ | Chat Template | Supported |

  ## What is Context-1?

@@ -48,9 +56,17 @@ Key capabilities:

  Performance: comparable to frontier LLMs at a fraction of the cost, up to **10x faster inference**.

+ ## Performance on Apple Silicon
+
+ | Metric | Value |
+ |---|---|
+ | Prompt Processing | 227 tokens/sec |
+ | Generation | 172 tokens/sec |
+ | Peak Memory | 12 GB |
+
  ## Requirements

- - Apple Silicon Mac with sufficient unified memory
+ - Apple Silicon Mac with 16GB+ unified memory
  - `mlx-lm >= 0.31.2`

  ```bash
@@ -80,7 +96,7 @@ print(response)

  ### LM Studio

- This model is compatible with [LM Studio](https://lmstudio.ai) on Apple Silicon. Search for `context-1-MLX-4bit` in the model browser.
+ This model is compatible with [LM Studio](https://lmstudio.ai) on Apple Silicon. Search for `context-1-MLX-4bit` in the model browser and download directly.

  ## Important: Agent Harness

```
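
The Key Specs table in this diff lists the quantization as "4-bit affine, group size 64". As a rough illustration of what that means, the sketch below quantizes one group of 64 weights to 4-bit codes with a per-group scale and bias, then reconstructs them. This is a pure-Python sketch, not the MLX kernel: the function names (`quantize_group`, `dequantize_group`, `bits_per_weight`) are hypothetical, and the 16-bit size for each stored scale and bias is an assumption rather than something the README states.

```python
import random

def quantize_group(weights, bits=4):
    """Affine-quantize one group so that w ~= code * scale + bias."""
    levels = (1 << bits) - 1              # 15 for 4-bit codes
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo               # bias is the group minimum

def dequantize_group(codes, scale, bias):
    """Reconstruct approximate weights from the integer codes."""
    return [q * scale + bias for q in codes]

def bits_per_weight(bits=4, group_size=64, meta_bits=16):
    """Effective storage cost: codes plus one scale and one bias per group.

    meta_bits=16 is an assumption (16-bit scale and bias).
    """
    return bits + 2 * meta_bits / group_size

random.seed(0)
group = [random.gauss(0.0, 0.02) for _ in range(64)]   # one group of 64 weights
codes, scale, bias = quantize_group(group)
restored = dequantize_group(codes, scale, bias)
max_err = max(abs(a - b) for a, b in zip(group, restored))

print(f"effective bits per weight: {bits_per_weight()}")
print(f"quantization step: {scale:.6f}, max round-trip error: {max_err:.6f}")
```

Under these assumptions each weight costs slightly more than 4 bits once the per-group scale and bias are counted, and the round-trip error of any weight is bounded by half the quantization step of its group.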