|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- mandanya/logseq-query-clojure-big |
|
language: |
|
- en |
|
base_model: |
|
- Qwen/Qwen2.5-Coder-0.5B-Instruct |
|
pipeline_tag: text-generation |
|
tags: |
|
- code |
|
library_name: transformers |
|
--- |
|
|
|
# Model for building advanced queries in Logseq |
|
Logseq uses ClojureScript over Datalog to interact with notes.

LCQ stands for Logseq Clojure Query.
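For readers unfamiliar with the format, here is a representative hand-written example of the kind of advanced query the model targets (an illustration only, not an item from the dataset):

```clojure
#+BEGIN_QUERY
{:title "All TODO tasks"
 :query [:find (pull ?b [*])
         :where
         [?b :block/marker ?m]
         [(= ?m "TODO")]]
 :collapsed? false}
#+END_QUERY
```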
|
### Description of the approach |
|
About 100 examples were collected manually, and about 4500 more were generated from them using the Qwen2.5-Coder-7B-Instruct model. The test split of the dataset (about 100 synthetic examples) is run through the model with a system prompt describing the specifics of the queries, and the outputs are validated by the codestral-mamba model.
|
```python |
|
SYSTEM_PROMPT = """ |
|
You should create an advanced query for Logseq.
|
An advanced query should be written in ClojureScript over Datalog and should start and end with `#+BEGIN_QUERY` and `#+END_QUERY` respectively.
|
You should respond only with the query, without any additional information.
|
|
|
A query may consist of:
|
- :title - title of the query (required) |
|
- :query - query itself, usually contains :find, :where, ... (required) |
|
- :result-transform - transform function for the result (optional) |
|
- :group-by-page? (true or false, optional) |
|
- :collapsed? (true or false, usually false, optional) |
|
|
|
example of a response:
|
#+BEGIN_QUERY |
|
... |
|
#+END_QUERY |
|
""" |
|
``` |
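As a sketch of how a validation harness might assemble requests with this system prompt and check responses, assuming `build_messages` and `extract_query` as hypothetical helpers (they are not part of the released code):

```python
import re

# The full SYSTEM_PROMPT is shown above; abbreviated here for the sketch.
SYSTEM_PROMPT = "You should create an advanced query for Logseq. ..."

def build_messages(user_request):
    """Assemble OpenAI-style chat messages, system prompt first."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_request},
    ]

def extract_query(response):
    """Return the #+BEGIN_QUERY ... #+END_QUERY block, or None if absent."""
    m = re.search(r"#\+BEGIN_QUERY.*?#\+END_QUERY", response, re.DOTALL)
    return m.group(0) if m else None

reply = 'Sure!\n#+BEGIN_QUERY\n{:title "TODOs"}\n#+END_QUERY\nDone.'
print(extract_query(reply))  # only the query block, surrounding text stripped
```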
|
|
|
### Results |
|
| model                         |  overall | zero_shot | 1_shot | 3_shot | 5_shot |
|:------------------------------|---------:|----------:|-------:|-------:|-------:|
| Qwen2.5-Coder-0.5B-LCQ-v2     |   0.3333 |    0.3333 |    n/a |    n/a |    n/a |
| Qwen2.5-Coder-0.5B-LCQ-v1     |   0.2963 |    0.2963 |    n/a |    n/a |    n/a |
| Qwen2.5-Coder-7B-Instruct-AWQ |   0.0586 |    0.0247 | 0.0494 | 0.0988 | 0.0617 |
| gpt-4o                        |   0.0401 |    0.0123 | 0.0741 |  0.037 |  0.037 |
| gpt-4o-mini                   |   0.034  |    0.0123 | 0.0247 | 0.0617 |  0.037 |
| Qwen2.5-Coder-3B-Instruct     |   0.0278 |    0      | 0.0123 | 0.0617 |  0.037 |
| Qwen2.5-Coder-1.5B-Instruct   |   0.0123 |    0      |      0 | 0.0123 |  0.037 |
| Qwen2.5-Coder-0.5B-Instruct   |   0.0031 |    0      |      0 | 0.0123 |      0 |
|
### How to use |
|
I prefer to run the model with sglang:
|
```bash |
|
python3.11 -m venv .venv --prompt llm-inf |
|
|
|
source .venv/bin/activate |
|
|
|
pip install "sglang[all]" |
|
pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.3/ |
|
|
|
python3.11 -m sglang.launch_server \ |
|
--model-path mandanya/Qwen2.5-Coder-0.5B-LCQ-v2 \ |
|
--port 23335 \ |
|
--host 0.0.0.0 \ |
|
--mem-fraction-static 0.5 \ |
|
--served-model-name "Qwen2.5-Coder-0.5B-LCQ-v2" |
|
``` |
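Once the server is up, sglang exposes an OpenAI-compatible endpoint at `/v1/chat/completions`. A minimal client sketch, assuming the port and served model name from the launch command above (`build_payload` is a hypothetical helper, and the system prompt is abbreviated):

```python
import json
import urllib.request

def build_payload(user_request):
    """OpenAI-compatible chat-completion request body for the LCQ model."""
    return {
        "model": "Qwen2.5-Coder-0.5B-LCQ-v2",
        "messages": [
            {"role": "system", "content": "You should create an advanced query for Logseq. ..."},
            {"role": "user", "content": user_request},
        ],
        "temperature": 0.0,
    }

payload = build_payload("Find all TODO blocks tagged #project")
req = urllib.request.Request(
    "http://localhost:23335/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# resp = urllib.request.urlopen(req)  # uncomment once the server is running
# print(json.loads(resp.read())["choices"][0]["message"]["content"])
```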