
Write short titles for text documents on your computer -- quickly

That's the goal of this small 3B-parameter model (a finetune of the admittedly dated BLOOMZ). It was finetuned on the 1000+ personal notes on my own computer, a fair number of which already have titles; since those notes vary a lot in length, the model should be able to handle inputs of varying lengths.

Example input and output (the model was trained on completions only using SFTTrainer; in this example it generated everything below "### Title:"):

### Instruction:
You are an expert title-writing AI that, given the body text of a quick note made about some subject, writes a title that accurately and concisely summarizes the content of that note. If the body text is not obviously a note -- for instance, it is some code, or an essay -- your title will describe what kind of code or essay it is.

### Body text of note:
Things needed for solitaire or other solo game reinforcement learning:

A simulated environment that can be stepped and returns the next state, the rewards, and whether or not the sim is terminated. 

A discrete list of actions that can be taken

A neural network for predicting Q values. And a duplicate, "target" one.

A function that computes the loss of a Q function

a function that uses gradient tape to track the application of the above function, calculate the gradients for the given trainable variables, and apply these gradients. A helper to do a soft update can be used here

A data structure that stores states, "done" status, and rewards. This is the memory buffer. It should have a specific size.

Mini buffer?

helper to pick either exploited best value or random thing with epsilon % chance

### Title:
solitude reinforcement learning.txt</s>
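
If you just want to try it, a minimal generation sketch with transformers might look like this (the model ID below is a placeholder for this repo's ID, and nothing here is tuned):

```python
# Minimal sketch of getting a title out of the model with transformers.
# MODEL_ID is a placeholder for this repo's ID; the generation settings are not tuned.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "your-username/this-model"  # placeholder -- use this repo's ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

INSTRUCTION = (
    "### Instruction:\n"
    "You are an expert title-writing AI that, given the body text of a quick note made about "
    "some subject, writes a title that accurately and concisely summarizes the content of that "
    "note. If the body text is not obviously a note -- for instance, it is some code, or an "
    "essay -- your title will describe what kind of code or essay it is.\n\n"
)

def make_title(body: str) -> str:
    # Same layout as the prompt template further down this card.
    prompt = f"{INSTRUCTION}### Body text of note:\n{body}.\n\n### Title:\n"
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=32)
    # Decode only the newly generated tokens, i.e. the title.
    return tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

print(make_title("Things needed for solitaire or other solo game reinforcement learning: ..."))
```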

Since the model is small, it runs faster than larger models and should therefore handle bulk tasks well, especially if quantized. I built it to help myself organize and search the vast number of untitled text documents on my computer, and now you can too! Its true utility will probably come once it is quantized, which I still need to figure out how to do.
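
One route I might try is an 8-bit load via bitsandbytes; here is an untested sketch (it also needs the accelerate package, and the repo ID is again a placeholder):

```python
# Untested sketch: load the model in 8-bit via bitsandbytes to cut memory use.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    "your-username/this-model",  # placeholder -- use this repo's ID
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
```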

Legal

Uses the BLOOMZ RAIL license, which essentially prevents you from using the model for evil -- but that's not a very precise description on my part, so just go read the license if you use this.

Prompt template (for further finetuning) -- this is the exact code I used to turn each training example into a prompt:

### Instruction:
You are an expert title-writing AI that, given the body text of a quick note made about some subject, writes a title that accurately and concisely summarizes the content of that note. If the body text is not obviously a note -- for instance, it is some code, or an essay -- your title will describe what kind of code or essay it is.

### Body text of note:
{row['content']}.

### Title:
{row['title'].replace(re.escape('.txt'),'') + tokenizer.eos_token}

No, I don't know why it still writes titles ending in .txt when I tried to remove that from my training data. Oh well. (My best guess in hindsight: str.replace does plain substring matching, so passing it re.escape('.txt') made it look for the literal text \.txt, which never occurs -- meaning the extension was never actually stripped from the training targets.)
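
If you reuse the template, a plain str.replace should do what I intended; a one-line sketch:

```python
# Strip a literal ".txt" suffix from the target title; str.replace needs no escaping.
title_target = row['title'].replace('.txt', '') + tokenizer.eos_token
```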

Known issues

  • It will probably assume every file is a .txt file, since it was only trained on those.
  • I don't know how it handles a wide variety of formats; I've literally only tested it on two files so far.
  • I didn't know that BLOOMZ has a bos_token, so the model was finetuned without one.

If you want, you can finetune this model on your specific use case or document format, to organize your own work!
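
A rough, untested sketch of what that could look like with trl's SFTTrainer and completion-only loss (the same general setup used for this model): the dataset rows and hyperparameters below are placeholders, and the exact SFTTrainer/SFTConfig arguments differ between trl versions, so check the docs for your installed version.

```python
# Untested sketch of further finetuning with completion-only loss.
# Exact SFTTrainer/SFTConfig arguments differ between trl versions -- check your installed docs.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DataCollatorForCompletionOnlyLM, SFTConfig, SFTTrainer

MODEL_ID = "your-username/this-model"  # placeholder -- use this repo's ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# Placeholder rows: swap in your own documents and titles.
dataset = Dataset.from_list([
    {"content": "body text of one of your documents", "title": "a short title"},
])

def formatting_func(example):
    # Build one prompt per row, matching the template above.
    texts = []
    for content, title in zip(example["content"], example["title"]):
        texts.append(
            "### Instruction:\n...the same instruction as in the template above...\n\n"
            f"### Body text of note:\n{content}.\n\n"
            f"### Title:\n{title}{tokenizer.eos_token}"
        )
    return texts

# Mask the loss on everything before the response template: train on completions only.
collator = DataCollatorForCompletionOnlyLM("### Title:\n", tokenizer=tokenizer)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    args=SFTConfig(output_dir="titler-finetune"),
    formatting_func=formatting_func,
    data_collator=collator,
)
trainer.train()
```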

Just get GPT-4 to write you a script that calls it on the body text of the files in a specified directory and renames them to the model's output, and you're good. I haven't actually done the bulk rename of my own files yet, so I don't have a tested script to share; once I do, I'll add it here.
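
In the meantime, here is a rough, untested sketch of the idea, reusing make_title (and the loaded model and tokenizer) from the generation sketch near the top of this card; the directory path is a placeholder:

```python
# Untested sketch: retitle every .txt file in a directory.
# Assumes make_title (and the loaded model/tokenizer) from the generation sketch above.
import os

DIRECTORY = "notes"  # placeholder: directory of untitled .txt files

for name in os.listdir(DIRECTORY):
    if not name.endswith(".txt"):
        continue
    path = os.path.join(DIRECTORY, name)
    with open(path, encoding="utf-8") as f:
        body = f.read()
    # The model tends to append ".txt" itself (see Known issues), so strip it before re-adding.
    title = make_title(body).strip().replace("/", "-").removesuffix(".txt")
    os.rename(path, os.path.join(DIRECTORY, title + ".txt"))
```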

This is the first model I've pushed to Hugging Face! If you have any comments or feedback, please do message me, or find me on Discord -- I'm also called Heralax over there. You can find me in TheBloke's Discord if you look hard enough.
