Kamisori-daijin/email-datasets-20k
Inspired by the 28M model email experiment, a post on r/LocalLLaMA on Reddit.
Trained on email-datasets-20k, the exact dataset used in that experiment.
Make sure you have uv installed and have downloaded the repo, email.strawberry, and cl8k.bin.
email.strawberry is the model and cl8k.bin is the tokenizer.
uv run src/sample.py -h will give you this output:
    usage: sample.py [-h] --model MODEL --encoder ENCODER [--length LENGTH] [--temperature TEMPERATURE] [--top_k TOP_K]
                     [--text_prompt TEXT_PROMPT]

    A powerful text encryption and decryption program.

    options:
      -h, --help            show this help message and exit
      --model, -i MODEL     model path
      --encoder, -e ENCODER
                            encoder path
      --length, -l LENGTH   output length
      --temperature, -t TEMPERATURE
                            output temperature
      --top_k, -f TOP_K     output top_k
      --text_prompt, -T TEXT_PROMPT
                            Text input from the command line.
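For reference, the flags in the help output above could come from an argparse parser along these lines. This is a minimal sketch reconstructed from the help text only, not the actual contents of src/sample.py. The argument types (int, float) are assumptions, and the defaults are left as None because the help output does not show them.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Program name and description are copied from the help output above.
    parser = argparse.ArgumentParser(
        prog="sample.py",
        description="A powerful text encryption and decryption program.",
    )
    # --model and --encoder appear without brackets in the usage line, so they are required.
    parser.add_argument("--model", "-i", required=True, help="model path")
    parser.add_argument("--encoder", "-e", required=True, help="encoder path")
    # Assumption: the types below are guesses; the help text does not state them.
    parser.add_argument("--length", "-l", type=int, help="output length")
    parser.add_argument("--temperature", "-t", type=float, help="output temperature")
    parser.add_argument("--top_k", "-f", type=int, help="output top_k")
    parser.add_argument("--text_prompt", "-T", help="Text input from the command line.")
    return parser


# Parse the same arguments used in the example invocation.
args = build_parser().parse_args(["-i", "email.strawberry", "-e", "cl8k.bin", "-T", "Write a"])
print(args.model, args.encoder, args.text_prompt)
```

Only the model and encoder paths are mandatory; the remaining options can be omitted.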
uv run src/sample.py -i email.strawberry -e cl8k.bin -T "Write a" will give you output something like this:
> Write a
Write a apologetic and humble business email from a Legal Counsel to a Recruiter regarding proposing a joint webinar, specifically while the system is partially down.<|eop|>
Regarding Recent Server Outage - [Company Name] and Integration
Dear [Recruiter Name],
I am writing to sincerely apologize for the recent server outage that occurred during the [Hackathon Name] hackathon. We recently implemented [mention the issue].
Due to this Serruption, we’ve been working diligently to mitigate this dissemination of our system, which has unfortunately impacted our ability to report this unacceptable correcting. Recognizing the significantly impacting [Industry Crisis], a broader experience experienced an unforeseen issue impacting the scale of this contract. We understand the enthusiasm and the demands of this investment.
We’ve already implemented a minor absence to process the potential disruption and a deepfaken
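The --temperature and --top_k options suggest that the next token is drawn with temperature scaling and top-k filtering. The sketch below shows how that kind of sampling is commonly done; it is an illustration under that assumption, not the code in src/sample.py, and the function name sample_top_k is hypothetical.

```python
import math
import random


def sample_top_k(logits, temperature=1.0, top_k=None, rng=random):
    """Sample one token id from raw logits using temperature and top-k filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Lower temperature sharpens the distribution, higher temperature flattens it.
    scaled = [x / temperature for x in logits]
    # Rank token ids by score and keep only the k best (all of them if top_k is None).
    ids = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    if top_k is not None:
        ids = ids[:max(1, top_k)]
    # Softmax weights over the kept ids, subtracting the max for numerical stability.
    best = scaled[ids[0]]
    weights = [math.exp(scaled[i] - best) for i in ids]
    # random.choices normalizes the weights internally.
    return rng.choices(ids, weights=weights, k=1)[0]


rng = random.Random(0)
print(sample_top_k([0.1, 2.0, -1.0], temperature=0.8, top_k=2, rng=rng))
```

With top_k=1 this reduces to greedy decoding, since only the highest-scoring token survives the filter.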
out.txt contains the training logs.

Input tokens
|
[Token Embedding]
|
[2x Strawberry Blocks] # 2 layers
|--- Scaled Dot Product Attention
| |--- Rotary Positional Embeddings
| |--- QK Norm
| |--- Multi-Headed Attention
|--- SiLU non-linearity
|--- Scaled Dot Product Attention
| |--- Rotary Positional Embeddings
| |--- QK Norm
| |--- Multi-Headed Attention
| |
[Output Projection (weight-tied)]
|
Next token logits
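The diagram can be read as the following data flow. This is a rough sketch in plain Python, meant only to illustrate the order of operations. Several simplifications are assumptions and not taken from the model: a single attention head with no learned projections, QK norm as plain L2 normalization, causal masking, no residual connections, and a RoPE base of 10000.

```python
import math


def silu(x):
    # SiLU(x) = x * sigmoid(x)
    return x / (1.0 + math.exp(-x))


def l2_normalize(v, eps=1e-6):
    # QK norm, assumed here to be L2 normalization without a learned scale.
    n = math.sqrt(sum(x * x for x in v)) + eps
    return [x / n for x in v]


def rope(v, pos, base=10000.0):
    # Rotate each consecutive pair (v[i], v[i+1]) by an angle that depends on position.
    d = len(v)
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out += [v[i] * c - v[i + 1] * s, v[i] * s + v[i + 1] * c]
    return out


def attention(xs):
    # Single-head causal scaled dot-product attention with q = k = normalized, rotated input.
    d = len(xs[0])
    qk = [rope(l2_normalize(x), t) for t, x in enumerate(xs)]
    out = []
    for t in range(len(xs)):
        scores = [sum(a * b for a, b in zip(qk[t], qk[s])) / math.sqrt(d) for s in range(t + 1)]
        m = max(scores)
        w = [math.exp(sc - m) for sc in scores]
        z = sum(w)
        out.append([sum(w[s] / z * xs[s][j] for s in range(t + 1)) for j in range(d)])
    return out


def strawberry_block(xs):
    # Attention, then SiLU, then attention again, following the diagram.
    h = attention(xs)
    h = [[silu(x) for x in row] for row in h]
    return attention(h)


def forward(token_ids, embedding, n_blocks=2):
    # Token embedding lookup.
    h = [list(embedding[t]) for t in token_ids]
    for _ in range(n_blocks):
        h = strawberry_block(h)
    # Weight-tied output projection: reuse the embedding rows to score each vocabulary entry.
    last = h[-1]
    return [sum(a * b for a, b in zip(last, row)) for row in embedding]


# Toy embedding table: 3 tokens, 4 dimensions (values are arbitrary).
toy_embedding = [[0.1, 0.2, 0.3, 0.4], [0.5, -0.1, 0.0, 0.2], [-0.3, 0.4, 0.1, -0.2]]
print(forward([0, 2, 1], toy_embedding))
```

The call returns one logit per vocabulary entry for the last position, which is what a sampler would consume to pick the next token.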