Qwen2.5-7B-Instruct-kowiki-qa-8bit: MLX-converted model (8-bit)
- Original model is beomi/Qwen2.5-7B-Instruct-kowiki-qa
Requirement
pip install mlx-lm
Usage
- Generate from the command line (the prompt is Korean for "Why is the sky blue?"):

      mlx_lm.generate --model mlx-community/Qwen2.5-7B-Instruct-kowiki-qa-8bit --prompt "하늘이 파란 이유가 뭐야?"
- Generate from Python:

      from mlx_lm import load, generate

      model, tokenizer = load(
          "mlx-community/Qwen2.5-7B-Instruct-kowiki-qa-8bit",
          tokenizer_config={"trust_remote_code": True},
      )

      # Korean for "Why is the sky blue?"
      prompt = "하늘이 파란 이유가 뭐야?"
      messages = [
          # Korean for "You are a friendly chatbot."
          {"role": "system", "content": "당신은 친절한 챗봇입니다."},
          {"role": "user", "content": prompt},
      ]
      # Render the messages with the model's chat template and
      # append the marker that opens the assistant's turn.
      prompt = tokenizer.apply_chat_template(
          messages,
          tokenize=False,
          add_generation_prompt=True,
      )

      text = generate(
          model,
          tokenizer,
          prompt=prompt,
          # verbose=True,
          # max_tokens=8196,
          # temp=0.0,
      )
      print(text)
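To make the chat-template step above more concrete, here is a minimal sketch of the ChatML-style layout that upstream Qwen2.5 templates produce for a plain system-plus-user conversation. `render_chatml` is a hypothetical helper written for illustration only, and it assumes this fine-tune keeps the upstream template unchanged; the string returned by `tokenizer.apply_chat_template` remains authoritative.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Approximate the ChatML layout for a simple conversation (illustrative only)."""
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>ROLE ... <|im_end|> markers.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a friendly chatbot."},
    {"role": "user", "content": "Why is the sky blue?"},
]
rendered = render_chatml(messages)
print(rendered)
```

Comparing this output with the real template output is a quick way to check whether the system message and the generation prompt ended up where you expect.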
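- Serve an OpenAI-compatible HTTP API (`--host 0.0.0.0` listens on all network interfaces):

      mlx_lm.server --model mlx-community/Qwen2.5-7B-Instruct-kowiki-qa-8bit --host 0.0.0.0

  Then query the server from Python with the `openai` client:

      import openai

      client = openai.OpenAI(
          base_url="http://localhost:8080/v1",
          # The client refuses to start without a key; this placeholder
          # assumes the local server does not check it.
          api_key="not-needed",
      )

      # Korean for "Why is the sky blue?"
      prompt = "하늘이 파란 이유가 뭐야?"
      messages = [
          # Korean for "You are a friendly chatbot."
          {"role": "system", "content": "당신은 친절한 챗봇입니다."},
          {"role": "user", "content": prompt},
      ]
      res = client.chat.completions.create(
          model="mlx-community/Qwen2.5-7B-Instruct-kowiki-qa-8bit",
          messages=messages,
          temperature=0.2,
      )
      print(res.choices[0].message.content)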
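If installing the `openai` package is not an option, the same request can be sketched with the standard library alone. The helper names `build_chat_payload` and `post_chat` are hypothetical, and the sketch assumes the server listens on `http://localhost:8080` and answers on the OpenAI-style `/v1/chat/completions` path with the same response shape the client example reads. Only the payload is built when the block runs; nothing is sent until `post_chat` is called.

```python
import json
import urllib.request

MODEL = "mlx-community/Qwen2.5-7B-Instruct-kowiki-qa-8bit"


def build_chat_payload(prompt, system="You are a friendly chatbot.", temperature=0.2):
    """Build the JSON body for a chat-completions request."""
    return {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": prompt},
        ],
        "temperature": temperature,
    }


def post_chat(payload, base_url="http://localhost:8080/v1", timeout=120):
    """POST the payload to the (assumed) local server and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]


payload = build_chat_payload("Why is the sky blue?")
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

With the server running, `print(post_chat(payload))` should show the model's answer; without it, the call raises `urllib.error.URLError`.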