adminguhantech committed on
Commit 7f6682a · verified · 1 Parent(s): 2c89582

Polish model card: lead with Cipher features, credit Qwen as foundation

Files changed (1)
  1. README.md +107 -42
README.md CHANGED
@@ -11,89 +11,154 @@ tags:
  - gguf
  - ciphercode
  - vscode
  library_name: gguf
  ---

  # CipherModel-1.5B

- > **The model behind CipherCode™ — the AI coding assistant that writes code the way YOU would.**
- > Closed-beta v0.1, by **Lila AI LLC**.

- This repository hosts the GGUF Q4_K_M quantization served by the [CipherCode VS Code extension](https://github.com/lila-ai-llc/ciphercode-vscode) (closed beta). It is built on top of [Qwen/Qwen2.5-Coder-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct) and is suitable for inline code completion, refactor / explain / fix / docstring tasks, and short conversational coding chat.

- ## What's in this repo

- | File | Size | Format |
- |---|---|---|
- | `CipherModel-1.5B-Q4_K_M.gguf` | ~1.07 GB | GGUF Q4_K_M (llama.cpp) |
- ## What this is

- - **A redistribution of `Qwen2.5-Coder-1.5B-Instruct` in GGUF Q4_K_M format**, branded as CipherModel-1.5B for use in the CipherCode extension's closed beta.
- - **No fine-tuning has been applied yet at v0.1.** The "Cipher Persona" style adaptation that ships with CipherCode operates entirely at the system-prompt level, injecting the developer's detected style into every request — model weights are unchanged from base Qwen.
- - A future v0.2+ release of this repo will contain a true LoRA fine-tune merged into the base.

- ## Usage

- ### Via the CipherCode VS Code extension (recommended)

  ```bash
- # Friends of Lila AI: install the .vsix sent to you privately
  code --install-extension ciphercode-0.1.0.vsix
  ```

- The extension talks to a private Cloud Run endpoint that serves this model via `llama-server`. End users of the extension never need to download this GGUF themselves.

- ### Direct with llama.cpp

  ```bash
- # Download the GGUF
- huggingface-cli download guhantech/CipherModel-1.5B CipherModel-1.5B-Q4_K_M.gguf --local-dir .

- # Run llama-server
  llama-server \
  -m CipherModel-1.5B-Q4_K_M.gguf \
  --host 0.0.0.0 --port 8080 \
  --ctx-size 4096 -np 5

- # Hit it
  curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
- -d '{"model":"cipher-model","messages":[{"role":"user","content":"write a python fizzbuzz"}],"max_tokens":256}'
  ```

- ### Direct with `llama-cpp-python`

  ```python
  from llama_cpp import Llama
  llm = Llama(model_path="CipherModel-1.5B-Q4_K_M.gguf", n_ctx=4096)
  out = llm("def fizzbuzz(n):", max_tokens=256)
  print(out["choices"][0]["text"])
  ```

- ## Specifications
-
- - **Architecture:** Qwen2.5-Coder (transformer)
- - **Parameters:** 1.5 B
- - **Context window:** 32 K (we run at 4 K in production for memory)
- - **Quantization:** Q4_K_M
- - **License:** Apache 2.0 (inherited from base model)
- - **Languages supported:** strong in Python, JavaScript, TypeScript, Java, Go, Rust, C/C++ — see Qwen2.5-Coder's eval table for details

- ## Limitations
-
- - Quality is meaningfully lower than Qwen-Coder-7B / 32B. For complex multi-file reasoning or long-context tasks, prefer the larger sizes.
- - Q4_K_M trades ~1–2% quality for ~4× smaller size vs full fp16. Acceptable for autocomplete and single-file tasks.
- - This is a closed-beta artifact; no SLAs, no support guarantees.
-
- ## Citation / credits

- Built on top of:

  ```bibtex
  @article{hui2024qwen2,
  title={Qwen2.5-Coder Technical Report},
- author={Binyuan Hui and Jian Yang and Zeyu Cui and Jiaxi Yang and Dayiheng Liu and Lei Zhang and Tianyu Liu and Jiajun Zhang and Bowen Yu and Keming Lu and Kai Dang and Yang Fan and Yichang Zhang and An Yang and Rui Men and Fei Huang and Bo Zheng and Yibo Miao and Shanghaoran Quan and Yunlong Feng and Xingzhang Ren and Xuancheng Ren and Jingren Zhou and Junyang Lin},
  journal={arXiv preprint arXiv:2409.12186},
  year={2024}
  }
  ```
@@ -101,10 +166,10 @@ Built on top of:

  ## Trademark

- CipherCode™ and Cipher Persona™ are trademarks of **Lila AI LLC**. All rights reserved.

- The CipherModel weights themselves are released under Apache 2.0 (inherited from Qwen). The trademarks restrict only how you may name and brand derivative work — the underlying weights are free to use.

  ---

- © 2026 Lila AI LLC.
 
  - gguf
  - ciphercode
  - vscode
+ - developer-tools
  library_name: gguf
  ---

  # CipherModel-1.5B

+ > **Your IDE's new best friend.**
+ > The model behind [CipherCode](https://huggingface.co/guhantech) — the AI coding assistant that learns *your* style, remembers *your* projects, and writes code in *your* voice.
+ >
+ > By **Lila AI LLC** · Closed beta v0.1

+ ---

+ ## What CipherCode Delivers

+ CipherCode isn't another generic completion plugin. It's a complete coding companion that lives natively inside VS Code and adapts to *you*.
+
+ ### Cipher Persona — Your Style, Learned
+
+ The first time you open a workspace, CipherCode silently scans your code and detects:
+
+ - Naming conventions (camelCase / snake_case / PascalCase)
+ - Function style (arrow functions vs named declarations)
+ - Async style (async/await vs `.then` chains)
+ - Comment placement and verbosity
+ - Indent size, semicolon preference, type-annotation density
+ - Your most-used libraries and imports
+
+ From that moment forward, every suggestion is generated to feel like *you* wrote it. Nothing leaves your machine — Persona lives entirely in VS Code's `globalState`.
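As a rough illustration of how such detection can work, here is a minimal sketch of one style heuristic: classifying the dominant naming convention in a snippet by counting identifier shapes. All names, patterns, and thresholds here are illustrative assumptions; CipherCode's actual detector is not public.

```python
import re
from collections import Counter

# Hypothetical sketch: classify a snippet's dominant naming convention.
# Patterns are illustrative, not CipherCode's real implementation.
CASE_PATTERNS = {
    "snake_case": re.compile(r"^[a-z]+(_[a-z0-9]+)+$"),
    "camelCase": re.compile(r"^[a-z]+([A-Z][a-z0-9]*)+$"),
    "PascalCase": re.compile(r"^([A-Z][a-z0-9]+){2,}$"),
}

def dominant_naming_style(source: str) -> str:
    """Count identifiers matching each convention; return the winner."""
    counts = Counter()
    for ident in re.findall(r"\b[A-Za-z_][A-Za-z0-9_]*\b", source):
        for style, pattern in CASE_PATTERNS.items():
            if pattern.match(ident):
                counts[style] += 1
                break
    return counts.most_common(1)[0][0] if counts else "unknown"

print(dominant_naming_style("def fetch_user_data():\n    user_id = 1"))  # snake_case
```

The other bullets (indent size, semicolon preference, async style) lend themselves to similarly cheap lexical counts, which is consistent with the card's claim that detection runs locally on workspace open.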
+
+ ### Project Memory — Continuity That Actually Helps
+
+ CipherCode remembers your project across sessions:
+
+ | What's tracked | Where |
+ |---|---|
+ | Project summary (auto-detected from `package.json` / README) | `.vscode/cipher-memory.json` |
+ | Project type (`node` / `python` / `other`) | local |
+ | Top 10 most-edited files | local |
+ | Architectural decisions you've made | local |
+ | Last 20 chat messages | local |
+ | Recurring patterns in your code | local |
+
+ This context is injected into every prompt, so when you come back tomorrow, the model already knows what you're building.
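To make the mechanism concrete, here is a minimal sketch of the memory round-trip: a workspace-local JSON file that is loaded at startup and merged into prompts. The JSON field names below are assumptions inferred from the table above, not the extension's actual schema.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical sketch of Project Memory persistence. The schema is an
# assumption based on the model card's table, not CipherCode's real one.
def save_memory(workspace: Path, memory: dict) -> Path:
    path = workspace / ".vscode" / "cipher-memory.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(memory, indent=2))
    return path

def load_memory(workspace: Path) -> dict:
    path = workspace / ".vscode" / "cipher-memory.json"
    return json.loads(path.read_text()) if path.exists() else {}

memory = {
    "summary": "CLI tool for parsing build logs",
    "project_type": "python",
    "top_files": ["parser.py", "cli.py"],
    "decisions": ["use argparse over click"],
    "chat_history": [],  # capped at the last 20 messages per the table
}

ws = Path(tempfile.mkdtemp())
save_memory(ws, memory)
print(load_memory(ws)["project_type"])  # python
```

Keeping the file under `.vscode/` inside the workspace (rather than a remote store) is what backs the card's privacy claim that project memory never leaves your machine.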
+
+ ### Smart Commands
+
+ Right-click anywhere in your editor:
+
+ - **Explain Code** — a clear summary of what's happening, even without a selection
+ - **Refactor Code** — clean up code while preserving your style
+ - **Fix Bug** — find and patch issues, style-matched
+ - **Add Comments** — comments written in your voice
+ - **Document This File** — language-aware doc comments (TSDoc / JSDoc / Google Python style / Javadoc / XML doc / Doxygen / godoc / rustdoc / PHPDoc / YARD)
+ - **Generate README from Project** — a full README generated from your code structure
+
+ Plus an inline chat sidebar with persistent history, code-block copy buttons, "Insert at cursor" actions, and a stop button that actually stops.
+
+ ### Privacy by Architecture
+
+ - Code stays on your machine — only the snippet you act on is sent for inference
+ - Persona never leaves your laptop
+ - Project memory lives in your workspace, not on a Lila AI server
+ - Self-hostable on your own GCP project if you want full ownership
+ - No telemetry, no accounts, no subscription
+
+ ---
+
+ ## Powered By
+
+ Built on **[Qwen2.5-Coder-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct)** — Alibaba's state-of-the-art open code model — quantized to **Q4_K_M** for efficient CPU inference and packaged for deployment via `llama.cpp`.
+
+ The intelligence in CipherCode comes from layering Persona detection, Project Memory, and carefully designed prompt templates on top of a strong base model. The CipherCode VS Code extension orchestrates all of it; this repo hosts the weights it serves.
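Since the card states that the style adaptation happens at the prompt level rather than in the weights, the layering can be pictured as simple system-prompt assembly. The function and field names below are purely illustrative; CipherCode's actual templates are not public.

```python
# Hypothetical sketch of prompt layering: persona traits and project
# memory folded into a system prompt before each request. Field names
# are assumptions, not CipherCode's real template.
def build_system_prompt(persona: dict, memory: dict) -> str:
    parts = ["You are CipherCode, a coding assistant inside VS Code."]
    if persona:
        parts.append(
            "Match the developer's style: "
            + ", ".join(f"{key}={value}" for key, value in persona.items())
        )
    if memory.get("summary"):
        parts.append(f"Current project: {memory['summary']}.")
    for decision in memory.get("decisions", []):
        parts.append(f"Respect this prior decision: {decision}.")
    return "\n".join(parts)

prompt = build_system_prompt(
    {"naming": "snake_case", "indent": 4},
    {"summary": "a build-log parser", "decisions": ["use argparse over click"]},
)
print(prompt)
```

Because the adaptation lives entirely in this assembled prompt, swapping the base model (or shipping the planned v0.2 LoRA) would not require changing the persona or memory machinery.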
+
+ A LoRA fine-tune is on the roadmap for v0.2, trained on real-world IDE workflow patterns collected during the closed beta.
+
+ ## Specifications
+
+ | Spec | Value |
+ |---|---|
+ | **Architecture** | Qwen2.5-Coder transformer |
+ | **Parameters** | 1.5 B |
+ | **Context window** | 32 K (production runs at 4 K for efficiency) |
+ | **Quantization** | Q4_K_M |
+ | **File size** | ~1.07 GB |
+ | **License** | Apache 2.0 — free for commercial use |
+ | **Strong languages** | Python, JavaScript, TypeScript, Java, Go, Rust, C/C++ |
+ ## Quick Start
+
+ ### Easy path — install the VS Code extension
+
+ If Lila AI sent you the closed-beta `.vsix`:
+
  ```bash
  code --install-extension ciphercode-0.1.0.vsix
  ```
+
+ Open VS Code; the welcome walkthrough opens automatically. Start typing. No setup, no token, no GCP.
+
+ ### Hands-on path — run the model locally
+
  ```bash
+ # Pull the GGUF
+ hf download guhantech/CipherModel-1.5B \
+   CipherModel-1.5B-Q4_K_M.gguf --local-dir .
+
+ # Serve with llama-server
  llama-server \
  -m CipherModel-1.5B-Q4_K_M.gguf \
  --host 0.0.0.0 --port 8080 \
  --ctx-size 4096 -np 5
+
+ # Make a request
  curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
+ -d '{
+   "model": "cipher-model",
+   "messages": [{"role": "user", "content": "write a python fizzbuzz"}],
+   "max_tokens": 256
+ }'
  ```
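The same request can be made from Python using only the standard library, since `llama-server` exposes an OpenAI-compatible endpoint. This is a sketch assuming the server from the previous step is already running on port 8080; the call itself is left commented out.

```python
import json
import urllib.request

# Same payload as the curl example above.
payload = {
    "model": "cipher-model",
    "messages": [{"role": "user", "content": "write a python fizzbuzz"}],
    "max_tokens": 256,
}

def chat(url: str = "http://localhost:8080/v1/chat/completions") -> str:
    """POST the payload to the OpenAI-compatible endpoint and return the reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]

# With llama-server running locally, uncomment to try it:
# print(chat())
```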
+
+ ### Python (`llama-cpp-python`)
+
  ```python
  from llama_cpp import Llama
+
  llm = Llama(model_path="CipherModel-1.5B-Q4_K_M.gguf", n_ctx=4096)
  out = llm("def fizzbuzz(n):", max_tokens=256)
  print(out["choices"][0]["text"])
  ```
+
+ ## Roadmap
+
+ | Version | Status | What's in it |
+ |---|---|---|
+ | **v0.1** | Live | Closed beta. Cipher Persona + Project Memory + 11 commands + chat sidebar. |
+ | **v0.2** | Planned | LoRA fine-tune on collected IDE workflows. Better instruction-following. |
+ | **v0.3** | Planned | Multi-file context awareness. Whole-project doc generation. |
+ | **v1.0** | Planned | Public Marketplace launch. Optional hosted Pro tier for zero setup. |
+
+ ## Citation
+
  ```bibtex
  @article{hui2024qwen2,
  title={Qwen2.5-Coder Technical Report},
+ author={Hui, Binyuan and Yang, Jian and Cui, Zeyu and others},
  journal={arXiv preprint arXiv:2409.12186},
  year={2024}
  }
  ```

  ## Trademark

+ **CipherCode** and **Cipher Persona** are trademarks of **Lila AI LLC**. All rights reserved.

+ The model weights are released under Apache 2.0 — free to use, modify, and redistribute. Trademarks restrict only how you may name and brand derivative work; the underlying weights remain unrestricted.

  ---

+ <sub>© 2026 Lila AI LLC · Built for developers who don't want their AI to sound like Stack Overflow.</sub>