Commit 0092801 (verified) by Carmenest · 1 Parent(s): deb1e7f

Add paper DOI reference

Files changed (1)
  1. README.md +3 -5
README.md CHANGED
@@ -18,6 +18,8 @@ GGUF quantized versions of [GSAI-ML/LLaDA-8B-Instruct](https://huggingface.co/GS
  
  LLaDA is a **diffusion language model** that generates text by iterative unmasking rather than autoregressive token-by-token prediction.
  
+ > **Paper:** [Diffusion Language Models are Faster than Autoregressive on CPU](https://doi.org/10.5281/zenodo.19119814) -- C. Esteban, 2026
+
  ## Available Quantizations
  
  | File | Quant | Size | Description |
@@ -66,9 +68,5 @@ cmake -B build -DCMAKE_BUILD_TYPE=Release
  cmake --build build -j$(nproc)
  
  # Generate with entropy_exit (recommended)
- python tools/generate.py \
- --model-dir /path/to/LLaDA-8B-Instruct \
- --gguf llada-8b-q4km.gguf \
- -p "What is the capital of France?" \
- -s 16 -t 12 --remasking entropy_exit
+ python tools/generate.py --model-dir /path/to/LLaDA-8B-Instruct --gguf llada-8b-q4km.gguf -p "What is the capital of France?" -s 16 -t 12 --remasking entropy_exit
  ```