Jerry999 committed on
Commit a4e7793 · verified · 1 parent: 852afb7

Clear v1 outputs (60 files) before uploading v2
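A cleanup commit like this can be made as a single atomic operation with `huggingface_hub` delete operations rather than file-by-file removals. The sketch below only builds the deletion list; the repo id and the hub calls shown in comments are assumptions, since the actual script behind this commit is not shown:

```python
# Paths removed by this commit: four v1 checkpoint folders plus the
# top-level README and chat template. A trailing slash marks a folder
# so the whole directory is deleted in one operation.
V1_CHECKPOINTS = [f"checkpoint-{step}/" for step in (1000, 2000, 3000, 3948)]
V1_TOP_LEVEL = ["README.md", "chat_template.jinja"]
paths_to_delete = V1_CHECKPOINTS + V1_TOP_LEVEL

# Hypothetical hub usage (requires write access; repo id is assumed):
# from huggingface_hub import HfApi, CommitOperationDelete
# HfApi().create_commit(
#     repo_id="Jerry999/Qwen3-8B_n3000_math",
#     operations=[CommitOperationDelete(path_in_repo=p) for p in paths_to_delete],
#     commit_message="Clear v1 outputs (60 files) before uploading v2",
# )

print(paths_to_delete)
```

Batching the deletes into one `create_commit` keeps the repo history to a single commit, which matches the single commit `a4e7793` shown here.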

This view is limited to 50 files because the commit contains too many changes; see the raw diff for the full list.
Files changed (50)
  1. README.md +0 -58
  2. chat_template.jinja +0 -61
  3. checkpoint-1000/chat_template.jinja +0 -61
  4. checkpoint-1000/config.json +0 -71
  5. checkpoint-1000/generation_config.json +0 -12
  6. checkpoint-1000/model.safetensors +0 -3
  7. checkpoint-1000/optimizer.bin +0 -3
  8. checkpoint-1000/pytorch_model_fsdp.bin +0 -3
  9. checkpoint-1000/rng_state_0.pth +0 -3
  10. checkpoint-1000/rng_state_1.pth +0 -3
  11. checkpoint-1000/scheduler.pt +0 -3
  12. checkpoint-1000/tokenizer.json +0 -3
  13. checkpoint-1000/tokenizer_config.json +0 -29
  14. checkpoint-1000/trainer_state.json +0 -0
  15. checkpoint-1000/training_args.bin +0 -3
  16. checkpoint-2000/chat_template.jinja +0 -61
  17. checkpoint-2000/config.json +0 -71
  18. checkpoint-2000/generation_config.json +0 -12
  19. checkpoint-2000/model.safetensors +0 -3
  20. checkpoint-2000/optimizer.bin +0 -3
  21. checkpoint-2000/pytorch_model_fsdp.bin +0 -3
  22. checkpoint-2000/rng_state_0.pth +0 -3
  23. checkpoint-2000/rng_state_1.pth +0 -3
  24. checkpoint-2000/scheduler.pt +0 -3
  25. checkpoint-2000/tokenizer.json +0 -3
  26. checkpoint-2000/tokenizer_config.json +0 -29
  27. checkpoint-2000/trainer_state.json +0 -0
  28. checkpoint-2000/training_args.bin +0 -3
  29. checkpoint-3000/chat_template.jinja +0 -61
  30. checkpoint-3000/config.json +0 -71
  31. checkpoint-3000/generation_config.json +0 -12
  32. checkpoint-3000/model.safetensors +0 -3
  33. checkpoint-3000/optimizer.bin +0 -3
  34. checkpoint-3000/pytorch_model_fsdp.bin +0 -3
  35. checkpoint-3000/rng_state_0.pth +0 -3
  36. checkpoint-3000/rng_state_1.pth +0 -3
  37. checkpoint-3000/scheduler.pt +0 -3
  38. checkpoint-3000/tokenizer.json +0 -3
  39. checkpoint-3000/tokenizer_config.json +0 -29
  40. checkpoint-3000/trainer_state.json +0 -0
  41. checkpoint-3000/training_args.bin +0 -3
  42. checkpoint-3948/chat_template.jinja +0 -61
  43. checkpoint-3948/config.json +0 -71
  44. checkpoint-3948/generation_config.json +0 -12
  45. checkpoint-3948/model.safetensors +0 -3
  46. checkpoint-3948/optimizer.bin +0 -3
  47. checkpoint-3948/pytorch_model_fsdp.bin +0 -3
  48. checkpoint-3948/rng_state_0.pth +0 -3
  49. checkpoint-3948/rng_state_1.pth +0 -3
  50. checkpoint-3948/scheduler.pt +0 -3
README.md DELETED
@@ -1,58 +0,0 @@
- ---
- base_model: Qwen/Qwen3-4B-Instruct-2507
- library_name: transformers
- model_name: Qwen3-8B_n3000_math
- tags:
- - generated_from_trainer
- - sft
- - trl
- licence: license
- ---
-
- # Model Card for Qwen3-8B_n3000_math
-
- This model is a fine-tuned version of [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507).
- It has been trained using [TRL](https://github.com/huggingface/trl).
-
- ## Quick start
-
- ```python
- from transformers import pipeline
-
- question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
- generator = pipeline("text-generation", model="None", device="cuda")
- output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
- print(output["generated_text"])
- ```
-
- ## Training procedure
-
-
-
-
-
- This model was trained with SFT.
-
- ### Framework versions
-
- - TRL: 0.29.0
- - Transformers: 5.5.3
- - Pytorch: 2.8.0
- - Datasets: 4.5.0
- - Tokenizers: 0.22.2
-
- ## Citations
-
-
-
- Cite TRL as:
-
- ```bibtex
- @software{vonwerra2020trl,
- title = {{TRL: Transformers Reinforcement Learning}},
- author = {von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
- license = {Apache-2.0},
- url = {https://github.com/huggingface/trl},
- year = {2020}
- }
- ```

chat_template.jinja DELETED
@@ -1,61 +0,0 @@
- {%- if tools %}
- {{- '<|im_start|>system\n' }}
- {%- if messages[0].role == 'system' %}
- {{- messages[0].content + '\n\n' }}
- {%- endif %}
- {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
- {%- for tool in tools %}
- {{- "\n" }}
- {{- tool | tojson }}
- {%- endfor %}
- {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
- {%- else %}
- {%- if messages[0].role == 'system' %}
- {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
- {%- endif %}
- {%- endif %}
- {%- for message in messages %}
- {%- if message.content is string %}
- {%- set content = message.content %}
- {%- else %}
- {%- set content = '' %}
- {%- endif %}
- {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
- {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
- {%- elif message.role == "assistant" %}
- {{- '<|im_start|>' + message.role + '\n' + content }}
- {%- if message.tool_calls %}
- {%- for tool_call in message.tool_calls %}
- {%- if (loop.first and content) or (not loop.first) %}
- {{- '\n' }}
- {%- endif %}
- {%- if tool_call.function %}
- {%- set tool_call = tool_call.function %}
- {%- endif %}
- {{- '<tool_call>\n{"name": "' }}
- {{- tool_call.name }}
- {{- '", "arguments": ' }}
- {%- if tool_call.arguments is string %}
- {{- tool_call.arguments }}
- {%- else %}
- {{- tool_call.arguments | tojson }}
- {%- endif %}
- {{- '}\n</tool_call>' }}
- {%- endfor %}
- {%- endif %}
- {{- '<|im_end|>\n' }}
- {%- elif message.role == "tool" %}
- {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
- {{- '<|im_start|>user' }}
- {%- endif %}
- {{- '\n<tool_response>\n' }}
- {{- content }}
- {{- '\n</tool_response>' }}
- {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
- {{- '<|im_end|>\n' }}
- {%- endif %}
- {%- endif %}
- {%- endfor %}
- {%- if add_generation_prompt %}
- {{- '<|im_start|>assistant\n' }}
- {%- endif %}

checkpoint-1000/chat_template.jinja DELETED
(61 lines deleted; content identical to the top-level chat_template.jinja diff above)

checkpoint-1000/config.json DELETED
@@ -1,71 +0,0 @@
- {
- "architectures": [
- "Qwen3ForCausalLM"
- ],
- "attention_bias": false,
- "attention_dropout": 0.0,
- "bos_token_id": null,
- "dtype": "float32",
- "eos_token_id": 151645,
- "head_dim": 128,
- "hidden_act": "silu",
- "hidden_size": 2560,
- "initializer_range": 0.02,
- "intermediate_size": 9728,
- "layer_types": [
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention"
- ],
- "max_position_embeddings": 262144,
- "max_window_layers": 36,
- "model_type": "qwen3",
- "num_attention_heads": 32,
- "num_hidden_layers": 36,
- "num_key_value_heads": 8,
- "pad_token_id": 151662,
- "rms_norm_eps": 1e-06,
- "rope_parameters": {
- "rope_theta": 5000000,
- "rope_type": "default"
- },
- "sliding_window": null,
- "tie_word_embeddings": true,
- "transformers_version": "5.5.3",
- "use_cache": false,
- "use_sliding_window": false,
- "vocab_size": 151936
- }

checkpoint-1000/generation_config.json DELETED
@@ -1,12 +0,0 @@
- {
- "do_sample": true,
- "eos_token_id": [
- 151645,
- 151643
- ],
- "pad_token_id": 151662,
- "temperature": 0.7,
- "top_k": 20,
- "top_p": 0.8,
- "transformers_version": "5.5.3"
- }

checkpoint-1000/model.safetensors DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:16cf530a69292d5ebcdc898ff6e27f40e9fa97d07ec9a6fff92606a1cbec50f4
- size 17645743048

checkpoint-1000/optimizer.bin DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:ad09a9b1f9d56fb5e24fccb31bc61995bcb8aa26d3d4e5771bcd332a90d2d66e
- size 32180124005

checkpoint-1000/pytorch_model_fsdp.bin DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:cde7e1f8a53dcc9407e8636dd3c4261b755f26602abf7c70e6eb4291c93496bd
- size 17645897996

checkpoint-1000/rng_state_0.pth DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:4dd7671ce88d469c49c0530724ac76b2306574002d1ecd1ca9294e41621fd96a
- size 14917

checkpoint-1000/rng_state_1.pth DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:3246ef1170ccca541a03b89ad6f20e01c51eb6834a2c2211c78c71c70f896879
- size 14917

checkpoint-1000/scheduler.pt DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:3e3184dc815b4354af3c63c9b5b618608d5206305b4414657ef8e0195f7ad089
- size 1465

checkpoint-1000/tokenizer.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:be75606093db2094d7cd20f3c2f385c212750648bd6ea4fb2bf507a6a4c55506
- size 11422650

checkpoint-1000/tokenizer_config.json DELETED
@@ -1,29 +0,0 @@
- {
- "add_prefix_space": false,
- "backend": "tokenizers",
- "bos_token": null,
- "clean_up_tokenization_spaces": false,
- "eos_token": "<|im_end|>",
- "errors": "replace",
- "extra_special_tokens": [
- "<|im_start|>",
- "<|im_end|>",
- "<|object_ref_start|>",
- "<|object_ref_end|>",
- "<|box_start|>",
- "<|box_end|>",
- "<|quad_start|>",
- "<|quad_end|>",
- "<|vision_start|>",
- "<|vision_end|>",
- "<|vision_pad|>",
- "<|image_pad|>",
- "<|video_pad|>"
- ],
- "is_local": false,
- "model_max_length": 1010000,
- "pad_token": "<|fim_pad|>",
- "split_special_tokens": false,
- "tokenizer_class": "Qwen2Tokenizer",
- "unk_token": null
- }

checkpoint-1000/trainer_state.json DELETED
The diff for this file is too large to render. See raw diff
 
checkpoint-1000/training_args.bin DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:bb9e429a6dba8782c1beb1411b31fa91f0c01ec6e0b1441e21d679f8a8b2c021
- size 6225

checkpoint-2000/chat_template.jinja DELETED
(61 lines deleted; content identical to the top-level chat_template.jinja diff above)

checkpoint-2000/config.json DELETED
(71 lines deleted; content identical to the checkpoint-1000/config.json diff above)

checkpoint-2000/generation_config.json DELETED
(12 lines deleted; content identical to the checkpoint-1000/generation_config.json diff above)

checkpoint-2000/model.safetensors DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:1b1ce241be74f81ade1793d7d1184e1cf7ce2e9afe46f5dd9418012bd1861b43
- size 17645743048

checkpoint-2000/optimizer.bin DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:07e07657f743306d7736d8218c799dfc731283d7dedfca7eb48d4dcc64c64623
- size 32180124005

checkpoint-2000/pytorch_model_fsdp.bin DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:27df8f98b77baf9afbd9bdac0a9ff6cc9e53f4d44310a5d8c665d45656911b2e
- size 17645897996

checkpoint-2000/rng_state_0.pth DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:95e5fc2074c0df31522a514f862c86cb00d71c946a7f15cc9ec0e53a69fb28a7
- size 14917

checkpoint-2000/rng_state_1.pth DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:0e7153eae67b6c9232a41bc996a2bf5b83229b8c7230d61911ac0fd40e64154e
- size 14917

checkpoint-2000/scheduler.pt DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:7c70c34042f727a1ef06eb662d77f90fe87f01cf21415dce97c8cb4c779b5625
- size 1465

checkpoint-2000/tokenizer.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:be75606093db2094d7cd20f3c2f385c212750648bd6ea4fb2bf507a6a4c55506
- size 11422650

checkpoint-2000/tokenizer_config.json DELETED
(29 lines deleted; content identical to the checkpoint-1000/tokenizer_config.json diff above)

checkpoint-2000/trainer_state.json DELETED
The diff for this file is too large to render. See raw diff
 
checkpoint-2000/training_args.bin DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:bb9e429a6dba8782c1beb1411b31fa91f0c01ec6e0b1441e21d679f8a8b2c021
- size 6225

checkpoint-3000/chat_template.jinja DELETED
(61 lines deleted; content identical to the top-level chat_template.jinja diff above)

checkpoint-3000/config.json DELETED
(71 lines deleted; content identical to the checkpoint-1000/config.json diff above)

checkpoint-3000/generation_config.json DELETED
(12 lines deleted; content identical to the checkpoint-1000/generation_config.json diff above)

checkpoint-3000/model.safetensors DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:0a87a133eb5ec5af0878395bc45e179834b11224819f981211f70acdd015060b
- size 17645743048

checkpoint-3000/optimizer.bin DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:0ff8e5977667fc938b297528391c931889487050b2acf34a78a42a820912cd38
- size 32180124005

checkpoint-3000/pytorch_model_fsdp.bin DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:3023a52ce183c0d2cddf839ebf937f5047e153db9c651eb9f295b9a386e6b589
- size 17645897996

checkpoint-3000/rng_state_0.pth DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:61e957b4cd785256be4cb26eb03060ef689e1d58f1766d7f26ca36a62bec4994
- size 14917

checkpoint-3000/rng_state_1.pth DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:550c54d430b44b77b0abe44c6e3ceba90a155305315c081b7616b35e2c18d1ce
- size 14917

checkpoint-3000/scheduler.pt DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:b07c9eca675fb8c47d0c01728c4ef879c66a752ffdace85e7e9feac32b48ac4b
- size 1465

checkpoint-3000/tokenizer.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:be75606093db2094d7cd20f3c2f385c212750648bd6ea4fb2bf507a6a4c55506
- size 11422650

checkpoint-3000/tokenizer_config.json DELETED
(29 lines deleted; content identical to the checkpoint-1000/tokenizer_config.json diff above)

checkpoint-3000/trainer_state.json DELETED
The diff for this file is too large to render. See raw diff
 
checkpoint-3000/training_args.bin DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:bb9e429a6dba8782c1beb1411b31fa91f0c01ec6e0b1441e21d679f8a8b2c021
- size 6225

checkpoint-3948/chat_template.jinja DELETED
(61 lines deleted; content identical to the top-level chat_template.jinja diff above)

checkpoint-3948/config.json DELETED
(71 lines deleted; content identical to the checkpoint-1000/config.json diff above)

checkpoint-3948/generation_config.json DELETED
(12 lines deleted; content identical to the checkpoint-1000/generation_config.json diff above)

checkpoint-3948/model.safetensors DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:e7db19800bbcf792dcb25dea9b5ae39f4e934a0d56f64ed6f74d7d89e87ae928
- size 17645743048

checkpoint-3948/optimizer.bin DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:656d334c407ae1443fcaeda271d597e51249875fdde8e1a12a024812f6de73ab
- size 32180124005

checkpoint-3948/pytorch_model_fsdp.bin DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:51d19fbc90bb938bf3c747a8b9c2b23f00398029d4ab146ca0ca0a0ea7d8885c
- size 17645897996

checkpoint-3948/rng_state_0.pth DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:61e957b4cd785256be4cb26eb03060ef689e1d58f1766d7f26ca36a62bec4994
- size 14917

checkpoint-3948/rng_state_1.pth DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:550c54d430b44b77b0abe44c6e3ceba90a155305315c081b7616b35e2c18d1ce
- size 14917

checkpoint-3948/scheduler.pt DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:deaab1725fa5d6abb332a09b31b7c4d93808c0289cb39a32cd5102547b98e285
- size 1465