## Example Test
```python
import onnxruntime as ort
import numpy as np
from transformers import AutoTokenizer

# Load the tokenizer of the base model
tokenizer = AutoTokenizer.from_pretrained("CohenQu/DeepSeek-R1-Distill-Qwen-7B-GRPO")

# Load the ONNX model
onnx_model_path = "model.onnx"  # Path to your ONNX model
session = ort.InferenceSession(onnx_model_path)

question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
inputs = tokenizer(question, return_tensors="np", padding=True)

# Run a single forward pass; passing None as the output names returns all outputs.
logits = session.run(
    None,
    {"input_ids": inputs["input_ids"], "attention_mask": inputs["attention_mask"]},
)[0]

# Note: argmax over the logits yields the model's top next-token prediction at
# each input position from one forward pass; it is not full autoregressive generation.
generated_text = tokenizer.decode(np.argmax(logits, axis=-1)[0], skip_special_tokens=True)
print("Generated Text:", generated_text)
```
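The single forward pass above gives one prediction per input position; producing new text requires feeding each predicted token back into the model. Below is a minimal greedy-decoding sketch under stated assumptions: `run_model` is a hypothetical callable standing in for a wrapper around `session.run` (returning logits of shape `(1, seq_len, vocab_size)`), and `eos_id` is an assumed end-of-sequence token id. The toy model included here exists only so the sketch is self-contained.

```python
import numpy as np

def greedy_generate(run_model, input_ids, max_new_tokens=8, eos_id=2):
    """Greedy autoregressive decoding: repeatedly append the argmax next token.

    run_model(ids) must return logits of shape (1, seq_len, vocab_size) —
    for the ONNX model above, a thin wrapper around session.run would do.
    """
    ids = list(input_ids)
    for _ in range(max_new_tokens):
        logits = run_model(np.array([ids], dtype=np.int64))
        next_id = int(np.argmax(logits[0, -1]))  # top token at the last position
        ids.append(next_id)
        if next_id == eos_id:  # stop at the (assumed) end-of-sequence id
            break
    return ids

# Toy stand-in for the ONNX session: always "predicts" (token + 1) mod vocab_size.
def toy_run_model(ids, vocab_size=10):
    logits = np.zeros((1, ids.shape[1], vocab_size))
    for pos in range(ids.shape[1]):
        logits[0, pos, (ids[0, pos] + 1) % vocab_size] = 1.0
    return logits

print(greedy_generate(toy_run_model, [3, 4], max_new_tokens=4))  # → [3, 4, 5, 6, 7, 8]
```

With the real session, `run_model` would call `session.run` with the same `input_ids`/`attention_mask` feeds shown in the example above and decode the final id list with `tokenizer.decode`.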
## Model tree for cfu/Qwen-1.5B-onnx

Base model: CohenQu/DeepSeek-R1-Distill-Qwen-7B-GRPO