🚧 WORK IN PROGRESS — v1 EARLY RELEASE 🚧

A bigger, better model (v2) is in development.

This page is the v1 early release of book-builder-bookwriter-v1. It is not the final model. Skim the rest of this card before you generate anything so you know what you're getting and what you're not.


book-builder-bookwriter-v1

A 7.6B-parameter prose-writing model fine-tuned on 10,211 human-authored novels (310,316 chapters, ~1.82 billion tokens) with a strict story-bible-to-chapter format. Built for BookBuilder to generate novel chapters from structured story bibles.

⚠ Work in Progress (v1, early release)

This is an early checkpoint of an ongoing training run. Training was paused at step 5000 / 9697 (~52% of one epoch) so the artifacts could be released publicly while the larger run continues.

Known limitations of v1:

  • Treats bibles as style prompts, not strict plot instructions. Expect drift from the synopsis on one-shot generation.
  • Does not reliably follow "FORBIDDEN" rules, character role assignments, or per-character constraints in rich BookBuilder-style bibles.
  • Loaded keywords in synopses (proper names like "Stillwater", specific years like "1947") can trigger off-topic associations from the training corpus.
  • On longer generations, may drift into paragraph-level repetition loops without the tuned sampling defaults in ollama/Modelfile.
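
If you wrap v1 in a pipeline, a cheap check for the repetition loops mentioned in the last item can reject a chapter before it is accepted. The sketch below is one possible heuristic, not part of BookBuilder: the function name repeated_paragraphs and the min_chars threshold are assumptions chosen for illustration.

```python
from collections import Counter


def repeated_paragraphs(text, min_chars=80):
    """Return normalized paragraphs that occur more than once in text.

    Paragraphs are split on blank lines and compared after lowercasing and
    collapsing whitespace. Short paragraphs (for example one-line dialogue)
    are skipped because they repeat naturally in fiction.
    """
    counts = Counter()
    for para in text.split("\n\n"):
        key = " ".join(para.lower().split())
        if len(key) >= min_chars:
            counts[key] += 1
    return [para for para, n in counts.items() if n > 1]
```

A non-empty result is a signal to regenerate the chapter, ideally with the tuned sampling defaults.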

What's coming:

  • v2 (Q3/Q4 2026): Larger base model (Qwen 2.5 14B or 32B) + synthesized instruction-following data so the model can actually obey FORBIDDEN sections, distinguish protagonists from antagonists, and follow beat sheets. This addresses the main v1 limitation.
  • v1 continuation to step 9697: the LoRA may be taken to the full single-epoch checkpoint and re-released as book-builder-bookwriter-v1.1 if testing shows it's worth the additional training compute. The resumable training state is preserved at branch resumable-step-5000.

Best use for v1 right now: as a prose-style backbone inside a structured pipeline (like BookBuilder itself) that provides per-chapter beat sheets and plot anchors at generation time. The pipeline supplies the discipline the v1 model can't enforce on its own. For one-shot "give me a chapter from a synopsis" use, v1 produces readable prose but will frequently drift from the intended plot.

This is NOT a chat model. Do not prompt it like ChatGPT. See "How to use" below.

What this model does

You write a Story Bible in the format shown below (or fill in the template). You give the model the bible followed by a ### Chapter header and the chapter title. The model writes the chapter prose.

It will not answer questions. It will not respond to "Write me a story about X." It only continues prose conditioned on the bible context.

Format the model expects

Every training example looked exactly like this:

### Bible
Title: [book title]
Author: [author name]
Genre: [genre]
Publisher: [publisher]
Synopsis: [1-3 paragraphs describing the book]

### Genre
[genre]



### Chapter
[chapter title]

[chapter prose...]

Your prompt MUST end with ### Chapter\n[chapter title]\n\n so that the model fills in the prose.
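
If you build prompts in code rather than by hand, a small helper can guarantee the required ending. This is a minimal sketch, assuming the five bible fields shown above; the function name build_prompt is hypothetical, and the single blank line between sections is an assumption, so compare the output against bible_template.txt before relying on it.

```python
def build_prompt(title, author, genre, publisher, synopsis, chapter_title):
    """Assemble a completion prompt in the bible-to-chapter layout."""
    bible = (
        "### Bible\n"
        f"Title: {title}\n"
        f"Author: {author}\n"
        f"Genre: {genre}\n"
        f"Publisher: {publisher}\n"
        f"Synopsis: {synopsis.strip()}\n"
    )
    genre_block = f"### Genre\n{genre}\n"
    # The prompt has to stop right after the chapter title and one blank line.
    chapter_header = f"### Chapter\n{chapter_title.strip()}\n\n"
    # Assumption: sections are separated by a single blank line.
    return bible + "\n" + genre_block + "\n" + chapter_header
```

Write the returned string to a file (for example my_book.txt) or pass it directly to whichever inference option you use.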

How to use

Option 1: Ollama (easiest)

IMPORTANT: A plain ollama pull from HF discards any Modelfile parameters in the repo, so Ollama runs the model with its default sampling. On long completion prompts, those defaults cause repetition loops and over-long generations. Use the setup script below to register the model under the name bookbuilder with tuned defaults that prevent both problems.

One-shot setup:

curl -sSL https://huggingface.co/Fordentinc/book-builder-bookwriter-v1/resolve/main/ollama/setup_ollama.sh | bash
# Optional: pass a quant tag, default is Q5_K_M
# curl -sSL https://huggingface.co/Fordentinc/book-builder-bookwriter-v1/resolve/main/ollama/setup_ollama.sh | bash -s Q8_0

Then:

ollama run bookbuilder < your_bible.txt

The script pulls the GGUF, builds a local Modelfile that sets repeat_penalty 1.18, num_predict 2500, and the stop tokens, and registers the result as bookbuilder.

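If you'd rather pull manually (without the loop fix), you need to set the sampling parameters yourself in every session. ollama run does not accept sampling flags on the command line, so set them with /set parameter at the interactive prompt:

ollama pull hf.co/Fordentinc/book-builder-bookwriter-v1:Q5_K_M
ollama run hf.co/Fordentinc/book-builder-bookwriter-v1:Q5_K_M
>>> /set parameter num_predict 2500
>>> /set parameter repeat_penalty 1.18
>>> /set parameter temperature 0.75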
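
For scripted use, the same three values can be sent with each request through Ollama's local HTTP API instead of being typed per session. The sketch below assumes a default Ollama install listening on port 11434 and uses the manually pulled model name. The helper names are hypothetical, and setting raw to True (to skip any chat template) is an assumption about what a completion-only model needs.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MANUAL_MODEL = "hf.co/Fordentinc/book-builder-bookwriter-v1:Q5_K_M"


def build_payload(prompt, model=MANUAL_MODEL):
    """Build a non-streaming /api/generate request with the card's values."""
    return {
        "model": model,
        "prompt": prompt,
        "raw": True,      # assumption: send the prompt without a chat template
        "stream": False,  # return one JSON object instead of a token stream
        "options": {
            "num_predict": 2500,
            "repeat_penalty": 1.18,
            "temperature": 0.75,
        },
    }


def generate(prompt, model=MANUAL_MODEL):
    """POST the payload to a running Ollama server and return the text."""
    data = json.dumps(build_payload(prompt, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Call generate() with the full contents of your bible file. Nothing is sent until you call it.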

Available quants:

  • Q4_K_M (4.7 GB) - fits on 8 GB GPUs, some token-decoding artifacts
  • Q5_K_M (5.4 GB) - balanced, recommended for 24 GB cards
  • Q8_0 (8.1 GB) - near-lossless, cleaner token decoding than K-quants
  • F16 (15.2 GB) - full precision, no quantization artifacts

Option 2: LM Studio

  1. Search for Fordentinc/book-builder-bookwriter-v1
  2. Download the Q5_K_M quant
  3. Switch to Completion mode (not Chat). This is critical.
  4. Paste your filled-in bible as the input
  5. Generate

Option 3: llama.cpp

./llama-cli -hf Fordentinc/book-builder-bookwriter-v1:Q5_K_M \
  --temp 0.8 --top-p 0.95 -n 2048 \
  -f your_bible.txt

Option 4: Transformers + PEFT (Python, full bf16)

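import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_id = "Qwen/Qwen2.5-7B"
adapter = "Fordentinc/book-builder-bookwriter-v1"

# Load the tokenizer from the adapter repo and the base model in bf16.
tok = AutoTokenizer.from_pretrained(adapter)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16, device_map="auto")
# Attach the LoRA adapter to the base weights.
model = PeftModel.from_pretrained(base, adapter)
model.eval()

# The file must end with "### Chapter\n[chapter title]\n\n".
with open("your_bible.txt", encoding="utf-8") as f:
    bible_and_chapter_header = f.read()

# Move inputs to wherever device_map placed the model instead of hard-coding "cuda".
inputs = tok(bible_and_chapter_header, return_tensors="pt").to(model.device)
with torch.inference_mode():
    out = model.generate(
        **inputs,
        max_new_tokens=2048,
        do_sample=True, temperature=0.8, top_p=0.95,
        repetition_penalty=1.05,
    )
# The decoded text contains the prompt followed by the generated chapter.
print(tok.decode(out[0], skip_special_tokens=True))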

Option 5: vLLM (production / OpenAI-compatible API)

vLLM cannot serve the LoRA adapter on its own (an adapter always needs its base model), so serve the merged bf16 weights instead:

vllm serve Fordentinc/book-builder-bookwriter-v1 --dtype bfloat16 --max-model-len 16384
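
Once the server is running, send prompts to the completions endpoint, not the chat endpoint, because this is not a chat model. The sketch below assumes vLLM's default port 8000 and the sampling values recommended on this card. The helper names are hypothetical, and repetition_penalty is a vLLM-specific extra field, not part of the standard OpenAI schema.

```python
import json
import urllib.request

VLLM_URL = "http://localhost:8000/v1/completions"  # assumption: default vLLM port
MODEL_ID = "Fordentinc/book-builder-bookwriter-v1"


def build_completion_request(prompt, max_tokens=2048):
    """Build a plain-completion request body for the vLLM server."""
    return {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.8,
        "top_p": 0.95,
        "repetition_penalty": 1.05,  # vLLM extension to the OpenAI schema
    }


def complete(prompt, max_tokens=2048):
    """POST the request and return the generated chapter text."""
    body = json.dumps(build_completion_request(prompt, max_tokens)).encode("utf-8")
    request = urllib.request.Request(
        VLLM_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["choices"][0]["text"]
```

Keep the prompt length plus max_tokens under the --max-model-len value passed to vllm serve.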

Step-by-step: from blank page to chapter

  1. Download the template: bible_template.txt
  2. Fill in every field. The model relies on each section to anchor character voices, setting, and tone.
  3. Save as plain text (e.g. my_book.txt).
  4. End the file with ### Chapter followed by your chapter title and one blank line. Example:
    ### Chapter
    Chapter 1: The Long Drive Home
    
  5. Run inference using one of the options above.
  6. The model writes ~1500-4000 words of prose, then stops or hits your max_new_tokens cap.
  7. For chapter 2: keep the same bible, change the chapter header, optionally append the last paragraph of chapter 1 so the model continues smoothly.
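
Step 7 can be scripted. The sketch below keeps the bible from the previous prompt, swaps in the new chapter title, and optionally carries over the last paragraph of the previous chapter. The function name next_chapter_prompt is hypothetical, and placing the carried-over paragraph directly after the new chapter header is an assumption about where "append" is meant to go.

```python
def next_chapter_prompt(prev_prompt, prev_chapter_text, next_title, carry_over=True):
    """Reuse the bible from prev_prompt and open a new chapter header."""
    marker = "### Chapter\n"
    idx = prev_prompt.rfind(marker)
    if idx == -1:
        raise ValueError("prompt has no '### Chapter' header")
    bible = prev_prompt[:idx]  # everything before the old chapter header
    prompt = f"{bible}{marker}{next_title.strip()}\n\n"
    if carry_over:
        paragraphs = [p.strip() for p in prev_chapter_text.split("\n\n") if p.strip()]
        if paragraphs:
            # Assumption: the carried-over paragraph opens the new chapter's prose.
            prompt += paragraphs[-1] + "\n\n"
    return prompt
```

With carry_over=False the result ends exactly at the chapter title and one blank line, matching the format the model was trained on.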

See example_bibles/ for two complete working examples.

Recommended sampling parameters

Parameter          | Value               | Why
temperature        | 0.8                 | Lower = repetitive, higher = incoherent
top_p              | 0.95                | Standard nucleus sampling
repetition_penalty | 1.05                | Prevents loops; do not push past 1.15
max_new_tokens     | 2048-4096           | Most chapters land in 1500-3500 tokens
min_p              | 0.05 (if supported) | Better than top_k for prose
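
For the Transformers route, the table can be kept in one place as keyword arguments for model.generate(). This is a sketch: the function name sampling_kwargs is hypothetical, and min_p is opt-in because only recent Transformers releases accept it.

```python
def sampling_kwargs(max_new_tokens=2048, use_min_p=False, repetition_penalty=1.05):
    """Return generate() keyword arguments built from the recommended values."""
    if repetition_penalty > 1.15:
        # The table advises against pushing the penalty past 1.15.
        raise ValueError("repetition_penalty above 1.15 is not recommended")
    kwargs = {
        "do_sample": True,
        "temperature": 0.8,
        "top_p": 0.95,
        "repetition_penalty": repetition_penalty,
        "max_new_tokens": max_new_tokens,
    }
    if use_min_p:
        kwargs["min_p"] = 0.05  # only if your installed backend supports min_p
    return kwargs
```

Usage: model.generate(**inputs, **sampling_kwargs(max_new_tokens=4096)).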

Training details

  • Base model: Qwen 2.5 7B (Apache 2.0)
  • Method: QLoRA, r=16, alpha=32, dropout 0.05
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Training corpus: 10,211 novels with derivable story bibles (Sci-Fi, Thriller, Romance, Crime, Western, Fantasy, etc.)
  • Corpus stats: 310,316 chapter rows, ~1.4 billion words, ~1.82 billion tokens
  • Hardware: 1× NVIDIA B200 (180 GB HBM3e)
  • Sequence length: 2048
  • Effective batch: 32 (per-device 4, grad-accum 8)
  • Optimizer: paged_adamw_8bit
  • LR: 2e-4 cosine, 3% warmup
  • Epochs: 1
  • Wall time: ~13.5 hours
  • Data cleanup: em-dashes removed (replaced with commas), smart-quotes normalized, residual front-matter stripped, chapters with <800 or >6000 words filtered out
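
The step counts quoted on this card follow from the numbers in this list. The arithmetic below assumes one chapter row per training example and that the final partial batch is dropped. Both are assumptions, but they reproduce the 9697 steps per epoch stated earlier.

```python
chapter_rows = 310_316
per_device_batch = 4
grad_accum = 8
released_step = 5000

effective_batch = per_device_batch * grad_accum
# Assumption: one chapter row per example, last partial batch dropped.
steps_per_epoch = chapter_rows // effective_batch
examples_seen = released_step * effective_batch
percent_of_epoch = round(100 * released_step / steps_per_epoch, 1)

print(effective_batch)   # 32
print(steps_per_epoch)   # 9697
print(examples_seen)     # 160000
print(percent_of_epoch)  # 51.6
```

So the v1 release has seen about 160,000 of the 310,316 chapter rows, which is the "~52% of one epoch" figure.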

Quirks and limitations

  • No system messages, no chat history, no [INST] tags. The model was never shown those during training.
  • Bibles outside its training distribution (highly experimental forms, non-Western names, heavy modern slang) may produce uneven results.
  • Names follow Western conventions (USA/UK/Italy/Western Europe). The training filter excluded other naming traditions.
  • Em-dashes are absent from training data and the model will rarely produce them. This is intentional.
  • Chapter length is learned from data (avg ~4,500 words). To force shorter chapters, cap max_new_tokens.
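
To pick a max_new_tokens cap for a target chapter length, you can convert words to tokens with the corpus-wide ratio from the training stats (~1.82 billion tokens over ~1.4 billion words). This is a rough sketch: the ratio is an average over the whole corpus, the helper names are hypothetical, and your own text may tokenize differently.

```python
corpus_tokens = 1.82e9
corpus_words = 1.4e9
chapter_rows = 310_316

tokens_per_word = corpus_tokens / corpus_words       # about 1.3
avg_words_per_chapter = corpus_words / chapter_rows  # about 4,512


def max_new_tokens_for(words):
    """Approximate token cap needed for a chapter of the given word count."""
    return round(words * tokens_per_word)


def words_for(max_new_tokens):
    """Approximate word count that fits under a given token cap."""
    return round(max_new_tokens / tokens_per_word)


print(max_new_tokens_for(1500))  # 1950
print(words_for(2048))           # 1575
```

By this estimate a cap of 2048 tokens yields roughly 1,575 words, so raise the cap if you want chapters closer to the training average.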

License

Apache 2.0 (inherited from the base model, Qwen 2.5 7B). Commercial use is permitted.

Citation

@misc{bookbuilder_bookwriter_v1_2026,
  author = {Fordentinc},
  title  = {book-builder-bookwriter-v1: A prose-writing LoRA on Qwen 2.5 7B},
  year   = {2026},
  url    = {https://huggingface.co/Fordentinc/book-builder-bookwriter-v1},
}

Reporting issues

Open a discussion on this model page. Include the bible you used (first 500 chars) and the first 200 chars of the model output.
