mkurman/Qwen2.5-14B-DeepSeek-R1-1M


	---
	license: apache-2.0
	base_model:
	- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
	- Qwen/Qwen2.5-14B-Instruct-1M
	pipeline_tag: text-generation
	library_name: transformers
	---

	# Qwen2.5-14B-DeepSeek-R1-1M

	This merged model combines the reasoning strengths of DeepSeek-R1-Distill-Qwen-14B with the long-context capabilities of Qwen2.5-14B-Instruct-1M for versatile performance.

	# Merge config

	```yaml
	models:
	  - model: "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B"
	    parameters:
	      weight: 1
	      density: 1

	merge_method: ties
	base_model: "Qwen/Qwen2.5-14B-Instruct-1M"
	parameters:
	  density: 1
	  normalize: true
	  int8_mask: true
	dtype: bfloat16
	```

	After merging, I also made some minor adjustments to the tokenizer configuration.
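For readers unfamiliar with the `ties` merge method named in the config, the sketch below walks through its usual steps (task vectors, trimming by `density`, sign election, and a normalized merge) on plain Python lists. It is a toy illustration under stated assumptions, not the code used to build this model: the function `ties_merge` and its arguments are hypothetical names, and real merge tooling works per tensor and may differ in details.

```python
def ties_merge(base, models, weights, density=1.0, normalize=True):
    """Toy TIES merge over flat lists of floats (hypothetical helper)."""
    n = len(base)
    # 1. Task vectors: what each fine-tuned model changed relative to the base.
    deltas = [[m[i] - base[i] for i in range(n)] for m in models]

    # 2. Trim: keep only the top `density` fraction of each delta by magnitude.
    keep = max(1, round(density * n))
    trimmed = []
    for d in deltas:
        top = set(sorted(range(n), key=lambda i: abs(d[i]), reverse=True)[:keep])
        trimmed.append([d[i] if i in top else 0.0 for i in range(n)])

    merged = []
    for i in range(n):
        contribs = [w * t[i] for w, t in zip(weights, trimmed)]
        # 3. Elect a sign per parameter from the weighted sum of deltas.
        sign = 1.0 if sum(contribs) >= 0 else -1.0
        # 4. Merge only the contributions that agree with the elected sign.
        agree = [(w, c) for w, c in zip(weights, contribs) if c * sign > 0]
        value = sum(c for _, c in agree)
        if normalize and agree:
            value /= sum(w for w, _ in agree)
        merged.append(base[i] + value)
    return merged


# One model, weight 1, density 1, mirroring the shape of the config above.
print(ties_merge([0.0, 1.0, 2.0], [[0.5, 0.0, 2.0]], [1.0]))  # [0.5, 0.0, 2.0]
```

In this toy version, a single model at `weight: 1` and `density: 1` has nothing to trim and no sign conflicts to resolve, so the result is the base plus the full task vector. Trimming and sign election only change the outcome when `density` is below 1 or several models are merged.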

	# How to Use
	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_name = "mkurman/Qwen2.5-14B-DeepSeek-R1-1M"
	# Load the weights in their stored dtype and spread them across available devices.
	model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
	tokenizer = AutoTokenizer.from_pretrained(model_name)

	prompt = "Write a Python script to merge two CSV files."
	messages = [
	    {"role": "system", "content": "You are an expert programmer."},
	    {"role": "user", "content": prompt},
	]
	# add_generation_prompt=True appends the assistant turn marker so the model starts a reply.
	inputs = tokenizer.apply_chat_template(
	    messages, add_generation_prompt=True, return_tensors="pt"
	).to(model.device)
	outputs = model.generate(inputs, max_new_tokens=512)
	# The decoded text contains the prompt followed by the generated reply.
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))
	```
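Since one of the merged models is a DeepSeek-R1 distill, the output may include a reasoning section wrapped in `<think>` and `</think>` tags before the final answer. The helper below is a minimal sketch for separating the two. It assumes this merged model keeps that R1-style tag convention, which the card does not confirm, and `split_reasoning` is a hypothetical name.

```python
def split_reasoning(text, open_tag="<think>", close_tag="</think>"):
    """Split a completion into (reasoning, answer).

    Assumes R1-style tags; returns an empty reasoning string if no
    closing tag is found.
    """
    head, sep, tail = text.partition(close_tag)
    if not sep:
        # No closing tag: treat the whole text as the answer.
        return "", text.strip()
    # Keep only what follows the last opening tag, if one is present
    # (some chat templates emit the opening tag as part of the prompt).
    reasoning = head.rpartition(open_tag)[2].strip()
    return reasoning, tail.strip()


reasoning, answer = split_reasoning("<think>\nadd 2 and 2\n</think>\n\n4")
print(answer)  # 4
```

Applied to the decoded text from the example above, `answer` would hold the part after the closing tag. If the opening tag is missing from the decoded text, `reasoning` also includes whatever precedes the closing tag, such as the prompt.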

	You can also run it in `LM Studio` or `Ollama` using the provided GGUF files.

	# License
	Released under the Apache 2.0 license to support open-source contribution and collaboration.