Example Test

```python
import onnxruntime as ort
import numpy as np
from transformers import AutoTokenizer

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("CohenQu/DeepSeek-R1-Distill-Qwen-7B-GRPO")

# Load the ONNX model
onnx_model_path = "model.onnx"  # Path to your ONNX model
session = ort.InferenceSession(onnx_model_path)

question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
inputs = tokenizer(question, return_tensors="np", padding=True)

# Run a single forward pass. Passing None as the output names returns all outputs;
# [0] takes the first one (the logits). ONNX language models typically expect
# int64 inputs, so cast explicitly in case the tokenizer returns int32.
output = session.run(
    None,
    {
        "input_ids": inputs["input_ids"].astype(np.int64),
        "attention_mask": inputs["attention_mask"].astype(np.int64),
    },
)[0]

# The logits have shape (batch, seq_len, vocab_size). Taking the argmax over the
# last axis decodes the model's top-scoring token at each input position. This is
# a single forward pass, not autoregressive generation: a real generation loop
# would repeatedly feed the predicted token back into the model.
generated_text = tokenizer.decode(np.argmax(output, axis=-1)[0], skip_special_tokens=True)
print("Generated Text:", generated_text)
```
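To see what the `np.argmax(output, axis=-1)` step does without loading the model, here is a minimal sketch using a dummy logits array in place of the ONNX output. The shape `(1, 3, 5)` (batch, sequence length, vocabulary size) is a toy stand-in; the real model's vocabulary is far larger.

```python
import numpy as np

# Dummy logits standing in for the ONNX model output:
# shape (batch=1, seq_len=3, vocab_size=5).
logits = np.array([[[0.1, 2.0, 0.3, 0.0, 0.0],
                    [0.0, 0.1, 0.2, 3.5, 0.0],
                    [1.2, 0.0, 0.0, 0.0, 0.9]]])

# argmax over the last axis picks the highest-scoring token id per position.
token_ids = np.argmax(logits, axis=-1)[0]
print(token_ids.tolist())  # [1, 3, 0]
```

These ids are then passed to `tokenizer.decode` to recover text.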
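For actual text generation, the model has to be run in a loop, appending the argmax of the *last* position's logits to the input and re-running until an end-of-sequence token appears. The sketch below shows the loop shape; `greedy_decode` and `fake_model` are hypothetical helpers introduced here for illustration. In real use, `run_step` would wrap `session.run` from the example above.

```python
import numpy as np

def greedy_decode(run_step, input_ids, eos_id, max_new_tokens=8):
    """Greedy autoregressive loop: re-run the model on the growing sequence
    and append the argmax of the final position's logits each step."""
    ids = list(input_ids)
    for _ in range(max_new_tokens):
        logits = run_step(np.array([ids]))       # shape (1, len(ids), vocab)
        next_id = int(np.argmax(logits[0, -1]))  # top-scoring next token
        if next_id == eos_id:
            break
        ids.append(next_id)
    return ids

# Stub in place of the ONNX session: always predicts (last id + 1) mod 5,
# so decoding from [0] walks 0 -> 1 -> 2 -> 3 and stops at eos_id=4.
def fake_model(input_ids):
    vocab = 5
    logits = np.zeros((1, input_ids.shape[1], vocab))
    for t in range(input_ids.shape[1]):
        logits[0, t, (input_ids[0, t] + 1) % vocab] = 1.0
    return logits

print(greedy_decode(fake_model, [0], eos_id=4))  # [0, 1, 2, 3]
```

Re-running the full sequence each step is the simplest form of the loop; exported models with past-key-value inputs can avoid the repeated recomputation.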