hokar3361 committed
Commit b6aee7b · verified · 1 Parent(s): ca1a830

Update README.md

Files changed (1)
  1. README.md +2 -21
README.md CHANGED
@@ -62,32 +62,13 @@ This release prioritizes **practical code generation quality** over benchmark scores.

  ## Quickstart

- ### 1) Transformers (merged weights)
-
- ```python
- from transformers import AutoTokenizer, AutoModelForCausalLM
- import torch
-
- repo = "hokar3361/gpt-oss-coderjs-v0.1"
- tok = AutoTokenizer.from_pretrained(repo, use_fast=True)
- model = AutoModelForCausalLM.from_pretrained(
-     repo,
-     torch_dtype=torch.bfloat16,
-     device_map="auto"
- )
-
- prompt = "```js\n// Write a function that flattens a nested array of numbers\n"
- inputs = tok(prompt, return_tensors="pt").to(model.device)
- out = model.generate(**inputs, max_new_tokens=128, temperature=0.3, do_sample=False)
- print(tok.decode(out[0], skip_special_tokens=True))
- 2) vLLM (recommended)
- bash
- コードをコピーする
+ ```bash
  vllm serve hokar3361/gpt-oss-coderjs-v0.1 \
    --async-scheduling \
    --max-model-len 4096 \
    --gpu-memory-utilization 0.90
  For LoRA-only repos, add --lora-modules as per vLLM documentation.
+ ```

  For merged weights, the above command is sufficient.
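
A quick smoke test of the merged-weights setup, assuming the `vllm serve` command from the hunk above is running on vLLM's default port 8000; the prompt mirrors the JS-comment style used in the removed Transformers example:

```bash
# Query vLLM's OpenAI-compatible completions endpoint (default port 8000).
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "hokar3361/gpt-oss-coderjs-v0.1",
        "prompt": "```js\n// Write a function that flattens a nested array of numbers\n",
        "max_tokens": 128,
        "temperature": 0.3
      }'
```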
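As context for the `--lora-modules` note above: a minimal sketch of serving a LoRA-only adapter with vLLM. The base-model and adapter names below are illustrative placeholders, not artifacts published by this model card; `--enable-lora` and `--lora-modules name=path` are the vLLM flags this would rely on.

```bash
# Hedged sketch for a LoRA-only repo; the merged-weights repo
# (hokar3361/gpt-oss-coderjs-v0.1) does not need these flags.
# BASE_MODEL and ADAPTER are hypothetical placeholders.
BASE_MODEL="<base-model-repo>"         # the model the LoRA was trained on
ADAPTER="<lora-adapter-repo-or-path>"  # hypothetical LoRA-only repo

vllm serve "$BASE_MODEL" \
  --enable-lora \
  --lora-modules coderjs="$ADAPTER" \
  --max-model-len 4096 \
  --gpu-memory-utilization 0.90
```

Requests then select the adapter by name, e.g. `"model": "coderjs"`, against the same OpenAI-compatible API.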