SmolLM2-135M Text2Cypher
This model is a fine-tune of HuggingFaceTB/SmolLM2-135M-Instruct that generates a Cypher query from a graph schema and a natural-language question.
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "oscardean/smollm2-135m-text2cypher"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# The system message fixes the task; the user message carries the schema and the question.
messages = [
    {
        "role": "system",
        "content": (
            "You translate natural-language questions into Cypher queries. "
            "Use only the supplied graph schema and return only the Cypher query."
        ),
    },
    {
        "role": "user",
        "content": (
            "Graph schema:\n"
            "Person {name: STRING}\n"
            "Movie {title: STRING, year: INTEGER}\n"
            "(Person)-[:DIRECTED]->(Movie)\n\n"
            "Question:\n"
            "Which movies did Christopher Nolan direct before 2010?"
        ),
    },
]

# return_dict=True returns input_ids together with the attention mask,
# so generate() receives both.
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Greedy decoding.
outputs = model.generate(
    **inputs,
    max_new_tokens=192,
    do_sample=False,
)

# Decode only the newly generated tokens, skipping the prompt.
prompt_length = inputs["input_ids"].shape[1]
prediction = tokenizer.decode(
    outputs[0, prompt_length:],
    skip_special_tokens=True,
)
print(prediction)
```
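The user message in the example follows a fixed layout: a `Graph schema:` block, a blank line, then a `Question:` block. A small helper can assemble that layout for other schemas and questions. This is a minimal sketch; `build_messages`, `SYSTEM_PROMPT`, and the parameter names are hypothetical and not part of the model repository.

```python
# Hypothetical helper: reproduces the prompt layout from the usage example.
SYSTEM_PROMPT = (
    "You translate natural-language questions into Cypher queries. "
    "Use only the supplied graph schema and return only the Cypher query."
)


def build_messages(schema_lines, question):
    """Return chat messages with the schema and question in the example's layout."""
    schema = "\n".join(schema_lines)
    user_content = f"Graph schema:\n{schema}\n\nQuestion:\n{question}"
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_content},
    ]


messages = build_messages(
    [
        "Person {name: STRING}",
        "Movie {title: STRING, year: INTEGER}",
        "(Person)-[:DIRECTED]->(Movie)",
    ],
    "Which movies did Christopher Nolan direct before 2010?",
)
print(messages[1]["content"])
```

The resulting `messages` list can be passed to `tokenizer.apply_chat_template` in place of the hand-written one.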
Training
| Hyperparameter | Value |
|---|---|
| Training samples | 1,000 |
| Validation samples | 75 |
| Epochs | 3 |
| Learning rate | 5e-5 |
| Batch size | 2 |
| Gradient accumulation | 4 |
| Effective batch size | 8 |
| Weight decay | 0.01 |
| Warmup ratio | 0.05 |
| Maximum sequence length | 800 |
| Decoding | Greedy |
| Checkpoint selection | Lowest validation loss |
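The values in the table imply a rough optimizer-step budget, which the arithmetic below works through. It is only an estimate: it assumes a single training device, that the last partial batch is kept, and that warmup steps are rounded up, none of which the card states.

```python
import math

# Values taken from the training table.
train_samples = 1000
batch_size = 2
grad_accum = 4
epochs = 3
warmup_ratio = 0.05

# Assumptions (not stated in the card): one device, no dropped batches,
# warmup steps rounded up.
effective_batch = batch_size * grad_accum
steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(effective_batch)   # 8
print(steps_per_epoch)   # 125
print(total_steps)       # 375
print(warmup_steps)      # 19
```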
Evaluation
Both the base model and the fine-tuned model were evaluated on the 50-sample test split.
| Metric | Base | Fine-tuned |
|---|---|---|
| Basic query structure | 2.00% | 100.00% |
| Token F1 | 12.35% | 55.20% |
| Node-label agreement | 0.00% | 58.00% |
| Component match rate | 29.20% | 49.60% |
| Normalized exact match | 0.00% | 0.00% |
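The card does not define these metrics. As an illustration only, token F1 and normalized exact match are often computed along the following lines. The whitespace tokenization and the normalization (lowercasing, collapsing whitespace, dropping a trailing semicolon) are assumptions, so the numbers in the table may come from different definitions.

```python
import re
from collections import Counter


def normalize(query):
    """Assumed normalization: trim, drop a trailing ';', collapse whitespace, lowercase."""
    return re.sub(r"\s+", " ", query.strip().rstrip(";")).strip().lower()


def token_f1(prediction, reference):
    """Harmonic mean of token precision and recall over whitespace tokens."""
    pred = normalize(prediction).split()
    ref = normalize(reference).split()
    if not pred or not ref:
        return float(pred == ref)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def normalized_exact_match(prediction, reference):
    """True when both queries are identical after normalization."""
    return normalize(prediction) == normalize(reference)


reference = "MATCH (p:Person)-[:DIRECTED]->(m:Movie) WHERE m.year < 2010 RETURN m.title"
prediction = "MATCH (p:Person)-[:DIRECTED]->(m:Movie) RETURN m.title"
f1 = token_f1(prediction, reference)
exact = normalized_exact_match(prediction, reference)
print(round(f1, 4), exact)
```

Under these definitions, a prediction that drops the `WHERE` filter still shares half of the reference tokens, so it earns partial token F1 while failing exact match.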
Limitations
- May hallucinate labels, relationships, or properties.
- May omit filters, constants, or return fields.
- May repeat conditions.
- May use incorrect relationship directions or operators.
- May generate SQL-like syntax instead of valid Cypher.
- Can produce structurally plausible but semantically incorrect queries.
- Should be validated before execution.
- Not intended for direct production use.
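One lightweight way to validate a generated query before execution is to reject write clauses and check node labels and relationship types against the schema. The sketch below is a regex heuristic, not a Cypher parser: `check_query` and the pattern names are hypothetical, and the patterns can misfire, for example on keywords inside string literals. Asking the database to plan the query with `EXPLAIN` is a more reliable syntax check.

```python
import re

# Heuristic patterns (assumptions, not a full Cypher grammar).
WRITE_CLAUSES = re.compile(r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP)\b", re.IGNORECASE)
NODE_LABEL = re.compile(r"\(\s*\w*\s*:\s*(\w+)")   # (p:Person) -> Person
REL_TYPE = re.compile(r"\[\s*\w*\s*:\s*(\w+)")     # [:DIRECTED] -> DIRECTED


def check_query(query, allowed_labels, allowed_rel_types):
    """Return a list of problems found in the query; an empty list means none were found."""
    problems = []
    if WRITE_CLAUSES.search(query):
        problems.append("contains a write clause")
    for label in NODE_LABEL.findall(query):
        if label not in allowed_labels:
            problems.append(f"unknown node label: {label}")
    for rel_type in REL_TYPE.findall(query):
        if rel_type not in allowed_rel_types:
            problems.append(f"unknown relationship type: {rel_type}")
    return problems


labels = {"Person", "Movie"}
rel_types = {"DIRECTED"}

good = "MATCH (p:Person)-[:DIRECTED]->(m:Movie) WHERE m.year < 2010 RETURN m.title"
bad = "MATCH (a:Actor)-[:ACTED_IN]->(m:Movie) DELETE a"

print(check_query(good, labels, rel_types))
print(check_query(bad, labels, rel_types))
```

An empty result only means these particular checks passed; a query can still be semantically wrong, for instance by using the wrong relationship direction or omitting a filter.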