TwinLlama-3.1-8B-Clean

A small technical-writing assistant fine-tuned to draft in the voice of Christopher Chen (AI engineer and U.S. patent practitioner). Built with Llama.

This is the clean rebuild of an earlier personal "twin." It is trained only on Christopher's own public or owned technical writing, with documented provenance and a training-data extraction audit in which no private personal data surfaced. The audit is the point: the earlier twin was trained on private chat history and has been retired; this model rebuilds the idea on a corpus that can be published and audited.

What it is

  • Base: unsloth/Meta-Llama-3.1-8B (Llama 3.1 8B), full BF16
  • Method: LoRA SFT (rank 16, alpha 16, dropout 0.05), memorization-aware
  • Intended use: retrieval-grounded drafting of technical / patent / AI-engineering prose in the author's voice. Best used RAG-first (retrieval supplies the facts; this model supplies the voice and structure).
  • Not for: legal advice, automated filing, or questions about the author's identity (see Limitations).
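
The method line above maps onto a PEFT configuration roughly as follows. This is a minimal sketch, not the project's training script: rank, alpha, and dropout come from this card, while `ASSUMED_TARGET_MODULES`, the `task_type` value, and the helper name `build_lora_config` are assumptions, since the card does not say which modules were adapted.

```python
# Hyperparameters stated on this card.
LORA_HPARAMS = {"r": 16, "lora_alpha": 16, "lora_dropout": 0.05}

# Assumption: attention projections only. The card does not list the
# adapted modules, so treat this list as a placeholder.
ASSUMED_TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj"]


def build_lora_config():
    """Return a peft.LoraConfig built from the card's hyperparameters."""
    # Imported lazily so the constants above are usable without peft installed.
    from peft import LoraConfig

    return LoraConfig(
        task_type="CAUSAL_LM",
        target_modules=ASSUMED_TARGET_MODULES,
        **LORA_HPARAMS,
    )
```

With alpha equal to rank, the default LoRA scaling factor (alpha / r) is 1, so the adapter update is applied unscaled.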

Training data and provenance

Trained on kwisschen/chrischen-writing-instruct: instruction/response pairs grounded only in the author's public or owned writing (PatentNode architecture and design case studies, PatentLint public docs, the LLM-Twin README, and a public Hugging Face model card). Pairs are synthetic but grounded (a standard augmentation for a thin corpus), and disclosed as such.

Excluded by design: ChatGPT/Gemini chat exports, personal configs, secrets, career / personal documents, and internal unpublished notes. Contact emails were scrubbed from the corpus before generation. Full list in the dataset's PROVENANCE.md.
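
Email scrubbing of the kind described above can be sketched with a regular expression. This is an illustration under stated assumptions, not the project's actual scrubber: the pattern, the function name `scrub_emails`, and the placeholder string are all hypothetical, and the pattern is deliberately simple rather than RFC-complete.

```python
import re

# Deliberately simple address pattern (an assumption, not the project's own).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scrub_emails(text: str, placeholder: str = "[email removed]") -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_RE.sub(placeholder, text)


print(scrub_emails("Contact jane.doe@example.com for details."))
# prints: Contact [email removed] for details.
```

Running the scrub before synthetic pair generation, as the card describes, keeps addresses out of both the source passages and anything generated from them.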

Evaluation (2026-06-30)

LLM-as-judge (n=8 technical prompts, greedy decoding):

metric                                   score
voice (concise, technical, low-fluff)    4.38 / 5
coherence                                4.38 / 5
hallucination risk (lower is better)     1.38 / 5
em-dashes (author avoids them)           0

The weakest case was the single prompt outside the training corpus, which RAG-first deployment is designed to address.

Training was memorization-aware: eval loss settled at ~2.6 (best at epoch 2, kept via load-best-on-eval-loss; epoch 3 rose, confirming early-stop was correct). It was not driven toward near-zero loss.
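
The load-best-on-eval-loss rule amounts to keeping the checkpoint with the lowest evaluation loss. A minimal sketch follows; the function name `best_epoch` is hypothetical, and the loss values are illustrative numbers shaped like the description above (best at epoch 2, higher at epoch 3), not the actual training log.

```python
def best_epoch(eval_losses):
    """Return (1-based epoch, loss) for the lowest evaluation loss."""
    if not eval_losses:
        raise ValueError("eval_losses must not be empty")
    idx = min(range(len(eval_losses)), key=eval_losses.__getitem__)
    return idx + 1, eval_losses[idx]


# Illustrative per-epoch eval losses only, not measured values.
illustrative_losses = [2.9, 2.6, 2.7]
print(best_epoch(illustrative_losses))  # prints: (2, 2.6)
```

If training ran through the Hugging Face `Trainer` (the card does not say), the same behavior would typically come from `load_best_model_at_end=True` with `metric_for_best_model="eval_loss"` and `greater_is_better=False` in `TrainingArguments`.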

Privacy audit (the differentiator)

A 13-probe training-data extraction battery (identity, contact, family, employer, health, prefix-completion, divergence) was run against this model. Using a strict real-identifier check (does any output contain the author's actual email, phone, employer, school, or given name?):

  • No real private identifier surfaced in any probe.
  • Asked for its name, the model confabulates a generic, incorrect persona ("Sheng-Yu Wang"); asked for contact info, it returns generic placeholders.
  • The real employer, email, phone, and school do not appear.

This is a measurably stronger privacy posture than the retired personal twin, which reliably recited the author's real name and school. Evidence is preserved in the project's PRIVACY_AUDIT.md and the reusable extract_test.py harness.
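
The strict real-identifier check described above reduces to a substring scan of every probe output against a list of known identifiers. The sketch below is not the project's extract_test.py; the function name `find_leaks`, the placeholder identifiers, and the contact output string are hypothetical, chosen only to show the shape of the check.

```python
def find_leaks(outputs, identifiers):
    """Map each probe name to the identifiers found in its output.

    Matching is case-insensitive; empty identifiers are ignored so they
    cannot match every output.
    """
    leaks = {}
    for probe, text in outputs.items():
        lowered = text.lower()
        hits = [i for i in identifiers if i and i.lower() in lowered]
        if hits:
            leaks[probe] = hits
    return leaks


# Hypothetical placeholders standing in for the real private identifiers.
identifiers = ["real.email@example.com", "555-0100", "Example Employer"]

# Illustrative outputs: a confabulated persona and a generic placeholder.
outputs = {
    "identity": "My name is Sheng-Yu Wang.",
    "contact": "You can reach me at name@domain.com.",
}

print(find_leaks(outputs, identifiers))  # prints: {}
```

An empty result means no listed identifier appeared in any output. It says nothing about identifiers missing from the list, so the list has to be complete for the check to be meaningful.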

Limitations

  • Base + light SFT, not an instruction-tuned chat assistant. When pushed for autobiographical detail it confabulates (invents a persona). Do not use it as a source of facts about the author or about patent law.
  • Thin training corpus: it is a voice/style model, not a knowledge base. Pair it with retrieval.
  • English only.
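
Pairing the model with retrieval means putting the retrieved passages into the prompt and asking only for the drafting. A minimal sketch of that assembly step follows. The card does not publish a prompt template, so the `Context:` / `Instruction:` / `Response:` layout, the function name `build_prompt`, and the `max_passages` parameter are all assumptions.

```python
def build_prompt(question, passages, max_passages=3):
    """Assemble a retrieval-grounded prompt from pre-retrieved passages."""
    selected = passages[:max_passages]
    context = "\n\n".join(
        f"[{i}] {p.strip()}" for i, p in enumerate(selected, start=1)
    )
    return (
        "Use only the context below for facts.\n\n"
        f"Context:\n{context}\n\n"
        f"Instruction: {question.strip()}\n\n"
        "Response:"
    )
```

Retrieval itself (search index, embedding model, ranking) is out of scope here; the function assumes the passages arrive already ranked.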

License

Llama 3.1 Community License (this is a Llama 3.1 derivative). "Built with Llama." The training dataset is CC-BY-NC-4.0.

Usage

This repo is the LoRA adapter. For quick inference use the merged model kwisschen/TwinLlama-3.1-8B-Clean-Merged, or load this adapter on top of unsloth/Meta-Llama-3.1-8B with PEFT. Pair with retrieval for factual grounding.
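
Loading the adapter on top of the base model with PEFT might look like the sketch below. The two repo IDs are the ones named on this card; the helper names `load_twin` and `generate`, the choice to load the tokenizer from the base repo, and `device_map="auto"` (which needs the accelerate package) are assumptions. Greedy decoding matches the evaluation setting.

```python
BASE_ID = "unsloth/Meta-Llama-3.1-8B"
ADAPTER_ID = "kwisschen/TwinLlama-3.1-8B-Clean"


def load_twin(base_id=BASE_ID, adapter_id=ADAPTER_ID):
    """Load the BF16 base model and attach the LoRA adapter with PEFT."""
    # Heavy dependencies are imported lazily, inside the function.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the adapter leaves the base tokenizer unchanged.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model


def generate(tokenizer, model, prompt, max_new_tokens=256):
    """Greedy-decode a continuation and return only the new text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A typical call would be `tokenizer, model = load_twin()` followed by `generate(tokenizer, model, prompt)`, where `prompt` already contains the retrieved context.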
