OpenOneRec committed
Commit 6b60859 · verified · 1 Parent(s): 4f781c9

Update README.md

Files changed (1): README.md (+5 -5)
README.md CHANGED
@@ -48,6 +48,7 @@ This repository contains:
 
 - **2026-04-28** — KSA technical report is released on arXiv: [arXiv:2604.24432](https://arxiv.org/abs/2604.24432).
 - **2026-04-28** — Code, training recipes, block-sparse kernel, and HuggingFace `trust_remote_code` template are open-sourced under this repository.
+- **2026-05-08** — [KSA-4B-base](https://huggingface.co/OpenOneRec/KSA-4B-base) (CPT from Qwen3-4B, 128K context) weights are released on HuggingFace.
 
 ## ✨ Highlights
 
@@ -60,11 +61,11 @@ This repository contains:
 
 ## 🤖 Model Zoo
 
-*Coming soon.* Pretrained checkpoints will be published on Hugging Face once the technical report is released.
+Pretrained checkpoints are published on HuggingFace.
 
 | Model | Backbone | Parameters | Context | Training | Link |
 | :------------ | :---------- | :--------- | :------ | :-------------------- | :---- |
-| KSA-4B (CPT) | Qwen3-4B | 4B | 128k | Continual pretraining | *TBD* |
+| KSA-4B-base | Qwen3-4B | 4B | 128k | Continual pretraining | [🤗 OpenOneRec/KSA-4B-base](https://huggingface.co/OpenOneRec/KSA-4B-base) |
 
 The 1.9B *from-scratch* configuration is provided as a reproducible recipe only; no 1.9B weights will be released.
 
@@ -257,8 +258,7 @@ The inference path uses HuggingFace's `AutoModelForCausalLM` with `trust_remote_code`.
 We are actively working on:
 
 - [x] Technical report on arXiv ([arXiv:2604.24432](https://arxiv.org/abs/2604.24432)).
-- [ ] Publish pretrained 1.9B checkpoints on Hugging Face.
-- [ ] Release the 4B continual-pretraining recipe and checkpoint.
+- [x] Release the 4B continual-pretraining checkpoint ([KSA-4B-base](https://huggingface.co/OpenOneRec/KSA-4B-base)).
 - [ ] Expanded evaluation scripts for RULER / NIAH / LongBench v2 reproduction.
 - [ ] A reference serving stack with the ring-buffer KV cache.
 - [ ] Additional ablations and tutorials.
@@ -292,4 +292,4 @@ KSA is built upon and inspired by the open-source ecosystem. We would like to th
 - **HuggingFace Transformers** — for the model / tokenizer / generation abstractions that make `trust_remote_code` deployment painless.
 - **PyTorch distributed training** — for FSDP, DCP, and the communication primitives that make large-scale pretraining tractable.
 
-We sincerely thank these projects for their outstanding work.
+We sincerely thank these projects for their outstanding work.
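The commit above points readers at the newly released `OpenOneRec/KSA-4B-base` checkpoint, and the README states that inference goes through HuggingFace's `AutoModelForCausalLM` with `trust_remote_code`. A minimal loading sketch under those assumptions (the repo id is taken from the diff; the prompt text and generation parameters are illustrative, not from the repository):

```python
# Illustrative sketch only -- not an excerpt from the KSA repository.
# Assumes the checkpoint follows the standard HuggingFace Auto* loading
# path with a custom `trust_remote_code` template, as the README states.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "OpenOneRec/KSA-4B-base"  # repo id from the diff above
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)

prompt = "Block-sparse attention reduces memory by"  # hypothetical prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

`trust_remote_code=True` is required because the model's architecture code ships inside the HuggingFace repo rather than in the `transformers` library itself.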
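The roadmap lists a reference serving stack with a ring-buffer KV cache. As a generic sketch of that idea (a fixed-capacity cache whose oldest entries are overwritten once it fills, giving sliding-window semantics), with the caveat that KSA's actual design may differ:

```python
class RingBufferKVCache:
    """Generic illustration of a ring-buffer KV cache, NOT the KSA
    repository's implementation: once `capacity` positions are stored,
    each new entry overwrites the oldest one (sliding-window semantics)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.keys = [None] * capacity
        self.values = [None] * capacity
        self.next_slot = 0  # slot that the next append will write to
        self.size = 0       # number of valid entries (<= capacity)

    def append(self, key, value):
        # Write into the current slot, then advance it modulo capacity,
        # so the oldest entry is silently overwritten when full.
        self.keys[self.next_slot] = key
        self.values[self.next_slot] = value
        self.next_slot = (self.next_slot + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def window(self):
        """Return (keys, values) of the live window, oldest to newest."""
        if self.size < self.capacity:
            return self.keys[:self.size], self.values[:self.size]
        # When full, the oldest entry sits at `next_slot`.
        order = list(range(self.next_slot, self.capacity)) + list(range(self.next_slot))
        return [self.keys[i] for i in order], [self.values[i] for i in order]


cache = RingBufferKVCache(capacity=3)
for t in range(5):
    cache.append(f"k{t}", f"v{t}")
keys, values = cache.window()
print(keys)  # -> ['k2', 'k3', 'k4']: only the last 3 positions survive
```

The appeal for long-context serving is that memory stays bounded at `capacity` entries per layer regardless of sequence length, at the cost of discarding attention state for positions outside the window.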