asusevski committed dcb25f8
1 Parent(s): e6ed4f2

Update README.md

Files changed (1):
  1. README.md +56 -37

README.md CHANGED
@@ -5,8 +5,8 @@ base_model: mistralai/Mistral-7B-v0.1
 
 # Model Card for Model ID
 
-<!-- Provide a quick summary of what the model is/does. -->
-
 
 
 ## Model Details
@@ -17,49 +17,33 @@ base_model: mistralai/Mistral-7B-v0.1
 
 
 
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-
-### Model Sources [optional]
-
-<!-- Provide the basic links for the model. -->
-
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
 
 ## Uses
 
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-
-### Direct Use
-
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
-[More Information Needed]
-
-### Downstream Use [optional]
-
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
-[More Information Needed]
-
-### Out-of-Scope Use
-
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-
-[More Information Needed]
 
 ## Bias, Risks, and Limitations
 
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
-[More Information Needed]
 
 ### Recommendations
 
@@ -71,7 +55,42 @@ Users (both direct and downstream) should be made aware of the risks, biases and
 
 Use the code below to get started with the model.
 
-[More Information Needed]
 
 ## Training Details
 
 
 # Model Card for Model ID
 
+LoRA model trained for ~11 hours on data from r/uwaterloo.
+Trained only on the most-upvoted top-level comments of each post.
 
 
 ## Model Details
 
 
 
+- **Developed by:** Anthony Susevski and Alvin Li
+- **Model type:** LoRA
+- **Language(s) (NLP):** English
+- **License:** mit
+- **Finetuned from model [optional]:** mistralai/Mistral-7B-v0.1
 
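The card does not state the LoRA training configuration. As a rough sketch, a `peft` setup for a causal LM could look like the following; every hyperparameter here is an assumption, not a value from the card:

```python
# Illustrative LoRA configuration with peft -- the model card does not state
# the actual rank, alpha, dropout, or target modules, so these are guesses.
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                                 # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling applied to the update
    lora_dropout=0.05,                    # dropout on the LoRA layers
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",                # causal language modeling objective
)
```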
 
 ## Uses
 
+Pass a post title and, optionally, a post text in the style of a Reddit post into the prompt below.
 
+```
+prompt = f"""
+Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
+
+### Instruction:
+Respond to the reddit post in the style of a University of Waterloo student.
+
+### Input:
+{post_title}
+{post_text}
+
+### Response:
+"""
+```
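Filling the template is plain f-string substitution; a minimal sketch with made-up post values:

```python
# Hypothetical r/uwaterloo-style post; any title/text pair works here.
post_title = "Is CS 246 hard?"
post_text = "Thinking of taking it next term alongside two other courses."

prompt = f"""
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Respond to the reddit post in the style of a University of Waterloo student.

### Input:
{post_title}
{post_text}

### Response:
"""
```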
 
 
 ## Bias, Risks, and Limitations
 
+No alignment training as of yet -- only supervised fine-tuning (SFT).
 
 ### Recommendations
 
 
 
 
 Use the code below to get started with the model.
 
+```
+from transformers import AutoTokenizer, AutoModelForCausalLM
+import torch
+from peft import PeftModel, PeftConfig
+
+# Select GPU if available.
+device = "cuda" if torch.cuda.is_available() else "cpu"
+
+peft_model_id = "asusevski/mistraloo-sft"
+peft_config = PeftConfig.from_pretrained(peft_model_id)
+model = AutoModelForCausalLM.from_pretrained(peft_config.base_model_name_or_path)
+model = PeftModel.from_pretrained(model, peft_model_id).to(device)
+model.eval()
+
+tokenizer = AutoTokenizer.from_pretrained(
+    peft_config.base_model_name_or_path,
+    add_bos_token=True
+)
+
+post_title = "my example post title"
+post_text = "my example post text"
+prompt = f"""
+Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
+
+### Instruction:
+Respond to the reddit post in the style of a University of Waterloo student.
+
+### Input:
+{post_title}
+{post_text}
+
+### Response:
+"""
+model_input = tokenizer(prompt, return_tensors="pt").to(device)
+with torch.no_grad():
+    model_output = model.generate(**model_input, max_new_tokens=256, repetition_penalty=1.15)[0]
+output = tokenizer.decode(model_output, skip_special_tokens=True)
+```
 
 ## Training Details
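`generate` echoes the prompt, so the decoded `output` still contains the whole template. A small helper (hypothetical, not part of the card) to keep only the model's reply:

```python
def extract_response(decoded: str) -> str:
    # The decoded generation repeats the prompt scaffold; keep only the text
    # after the final "### Response:" marker.
    return decoded.split("### Response:")[-1].strip()

sample = "### Instruction:\nRespond...\n\n### Response:\nGo geese!"
print(extract_response(sample))  # Go geese!
```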