m1ngcheng nielsr HF Staff committed on
Commit d7083f3 · verified · 1 Parent(s): 88acb1f

Improve model card with paper, code, project links and pipeline tag (#3)


- Improve model card with paper, code, project links and pipeline tag (64a200fd87066bf0e54e87bd0c91518ac44c6177)


Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +9 -5
README.md CHANGED
@@ -1,14 +1,22 @@
 ---
+library_name: transformers
 license: apache-2.0
 tags:
 - dllm
 - diffusion
 - llm
 - text_generation
-library_name: transformers
+pipeline_tag: text-generation
 ---
+
 # LLaDA-MoE
 
+This model is based on the principles described in the paper [Large Language Diffusion Models](https://huggingface.co/papers/2502.09992).
+
+- 📚 [Paper](https://huggingface.co/papers/2502.09992)
+- 🏠 [Project Page](https://ml-gsai.github.io/LLaDA-demo/)
+- 💻 [Code](https://github.com/ML-GSAI/LLaDA)
+
 **LLaDA-MoE** is a new and upgraded series of the LLaDA diffusion language model. This pre-release includes two cutting-edge models:
 
 - `LLaDA-MoE-7B-A1B-Base`: A base pre-trained model designed for research and secondary development.
@@ -223,10 +231,6 @@ input_ids = torch.tensor(input_ids).to(device).unsqueeze(0)
 
 text = generate(model, input_ids, steps=128, gen_length=128, block_length=32, temperature=0., cfg_scale=0., remasking='low_confidence')
 print(tokenizer.batch_decode(text[:, input_ids.shape[1]:], skip_special_tokens=False)[0])
-
-
-
-
 ```
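The metadata added here changes how the Hub treats the repo: `library_name: transformers` tells it which library loads the checkpoint, and `pipeline_tag: text-generation` files the model under text generation. A minimal loading sketch follows; the repo id `inclusionAI/LLaDA-MoE-7B-A1B-Base` and the `trust_remote_code` usage are assumptions for illustration and are not part of this commit — only the `generate(...)` call is taken from the diff context above.

```python
# Minimal loading sketch (assumptions: hypothetical repo id, custom remote code).
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "inclusionAI/LLaDA-MoE-7B-A1B-Base"  # hypothetical repo id, for illustration only
device = "cuda"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to(device).eval()

# Tokenize a prompt and build the input tensor, matching the hunk context
# `input_ids = torch.tensor(input_ids).to(device).unsqueeze(0)`.
prompt = "What is a masked diffusion language model?"
input_ids = tokenizer(prompt)["input_ids"]
input_ids = torch.tensor(input_ids).to(device).unsqueeze(0)

# The README's diffusion sampler (shown in the diff context) is then called as:
# text = generate(model, input_ids, steps=128, gen_length=128, block_length=32,
#                 temperature=0., cfg_scale=0., remasking='low_confidence')
# print(tokenizer.batch_decode(text[:, input_ids.shape[1]:], skip_special_tokens=False)[0])
```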