---
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb-edu
language:
- en
base_model:
- Qwen/Qwen3-1.7B
tags:
- linear-attention
- hybrid
- rnn
- distillation
---

Links:

- GitHub repo: <https://github.com/thunlp/hybrid-linear-attention>
- Paper: <https://arxiv.org/abs/2601.22156>

This is the final HypeNet-2B checkpoint from the paper [Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts](https://arxiv.org/pdf/2601.22156), distilled from Qwen3-1.7B using the HALO pipeline proposed in the paper. For more information, please refer to our GitHub repo.