Example Test

```python
import onnxruntime as ort
import numpy as np
from transformers import AutoTokenizer

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("CohenQu/DeepSeek-R1-Distill-Qwen-7B-GRPO")

# Load the ONNX model
onnx_model_path = "model.onnx"  # Path to your ONNX model
session = ort.InferenceSession(onnx_model_path)

question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
inputs = tokenizer(question, return_tensors="np", padding=True)

# Run a single forward pass. Passing None as the output names returns all outputs;
# [0] takes the first one (the logits). ONNX language models typically expect
# int64 inputs, so cast explicitly in case the tokenizer returns int32.
output = session.run(
    None,
    {
        "input_ids": inputs["input_ids"].astype(np.int64),
        "attention_mask": inputs["attention_mask"].astype(np.int64),
    },
)[0]

# The logits have shape (batch, seq_len, vocab_size). Taking the argmax over the
# last axis decodes the model's top-scoring token at each input position. This is
# a single forward pass, not autoregressive generation: a real generation loop
# would repeatedly feed the predicted token back into the model.
generated_text = tokenizer.decode(np.argmax(output, axis=-1)[0], skip_special_tokens=True)
print("Generated Text:", generated_text)
```
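To see what the `np.argmax(output, axis=-1)` step does without loading the model, here is a minimal sketch using a dummy logits array in place of the ONNX output. The shape `(1, 3, 5)` (batch, sequence length, vocabulary size) is a toy stand-in; the real model's vocabulary is far larger.

```python
import numpy as np

# Dummy logits standing in for the ONNX model output:
# shape (batch=1, seq_len=3, vocab_size=5).
logits = np.array([[[0.1, 2.0, 0.3, 0.0, 0.0],
                    [0.0, 0.1, 0.2, 3.5, 0.0],
                    [1.2, 0.0, 0.0, 0.0, 0.9]]])

# argmax over the last axis picks the highest-scoring token id per position.
token_ids = np.argmax(logits, axis=-1)[0]
print(token_ids.tolist())  # [1, 3, 0]
```

These ids are then passed to `tokenizer.decode` to recover text.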
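For actual text generation, the model has to be run in a loop, appending the argmax of the *last* position's logits to the input and re-running until an end-of-sequence token appears. The sketch below shows the loop shape; `greedy_decode` and `fake_model` are hypothetical helpers introduced here for illustration. In real use, `run_step` would wrap `session.run` from the example above.

```python
import numpy as np

def greedy_decode(run_step, input_ids, eos_id, max_new_tokens=8):
    """Greedy autoregressive loop: re-run the model on the growing sequence
    and append the argmax of the final position's logits each step."""
    ids = list(input_ids)
    for _ in range(max_new_tokens):
        logits = run_step(np.array([ids]))       # shape (1, len(ids), vocab)
        next_id = int(np.argmax(logits[0, -1]))  # top-scoring next token
        if next_id == eos_id:
            break
        ids.append(next_id)
    return ids

# Stub in place of the ONNX session: always predicts (last id + 1) mod 5,
# so decoding from [0] walks 0 -> 1 -> 2 -> 3 and stops at eos_id=4.
def fake_model(input_ids):
    vocab = 5
    logits = np.zeros((1, input_ids.shape[1], vocab))
    for t in range(input_ids.shape[1]):
        logits[0, t, (input_ids[0, t] + 1) % vocab] = 1.0
    return logits

print(greedy_decode(fake_model, [0], eos_id=4))  # [0, 1, 2, 3]
```

Re-running the full sequence each step is the simplest form of the loop; exported models with past-key-value inputs can avoid the repeated recomputation.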