---
datasets:
- msaad02/formatted-ss-cleaned-brockport-qa
language:
- en
pipeline_tag: text-generation
---

# BrockportGPT LLaMA2 Finetune

This model is part of my thesis, which builds a question-answering bot for SUNY Brockport. The finetuning data is a subset of the question database I generated and is also available on my Hugging Face account.

For more information, see the project's [GitHub repo](https://github.com/msaad02/honors-thesis).

## Examples

>Are SAT scores required?
>>SAT scores are not required for admission, but test scores are used as part of Brockport's holistic review process to make admissions decisions.

>Who can I contact for help with financial aid?
>>You can contact the financial aid office at SUNY Brockport for assistance with financial aid. They are available to answer any questions and help you navigate the financial aid process.


## Prompting

The model was finetuned with the following prompt format:

>Below is an inquiry related to SUNY Brockport - from academics, admissions, and faculty support to student life. Prioritize accuracy and brevity.
>
>\### Instruction:\
>{question}
>
>\### Response:\
>{response}
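
As a rough sketch, the template can be filled in with plain string formatting. The helper names `build_prompt` and `extract_response` are illustrative only and are not part of the model repository. The exact whitespace (a single blank line between sections) is an assumption based on the template as shown here.

```python
# Prompt template as described above; whitespace between sections is assumed.
PROMPT_TEMPLATE = (
    "Below is an inquiry related to SUNY Brockport - from academics, admissions, "
    "and faculty support to student life. Prioritize accuracy and brevity.\n"
    "\n"
    "### Instruction:\n"
    "{question}\n"
    "\n"
    "### Response:\n"
    "{response}"
)


def build_prompt(question: str, response: str = "") -> str:
    """Fill the template (hypothetical helper).

    Leave `response` empty at inference time so the prompt ends right
    after the "### Response:" header and the model writes the answer.
    """
    return PROMPT_TEMPLATE.format(question=question.strip(), response=response)


def extract_response(generated_text: str) -> str:
    """Return the text after the last "### Response:" header (hypothetical helper).

    If the header is missing, the whole text is returned unchanged (stripped).
    """
    return generated_text.rsplit("### Response:\n", 1)[-1].strip()


prompt = build_prompt("Are SAT scores required?")
print(prompt)
```

At inference time only the question is filled in; the `{response}` slot is what the model is expected to generate.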

## Usage

To run this model, I suggest the following code. It loads the model in 4-bit, builds a text-generation pipeline, and wraps it in a helper function that applies the correct prompt formatting:

```python
import textwrap

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline

# 4-bit NF4 quantization with bfloat16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "msaad02/llama2_7b_brockportgpt",
    quantization_config=bnb_config,
    device_map={"": 0},  # place the whole model on GPU 0
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained("msaad02/llama2_7b_brockportgpt")

# Use a pipeline as a high-level helper
pipe = pipeline(
    task="text-generation",
    model=model,
    tokenizer=tokenizer,
)

def qa(text: str, full: bool = False) -> str:
    # textwrap.dedent removes the common leading indentation from every line
    text = textwrap.dedent(f"""\
        Below is an inquiry related to SUNY Brockport - from academics, admissions, and faculty support to student life. Prioritize accuracy and brevity.

        ### Instruction:
        {text}

        ### Response:
        """)
    
    # max_new_tokens caps only the generated text; max_length would also count the prompt tokens
    response = pipe(text, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95, temperature=1.0)
    response = response[0]['generated_text']
    response = response.split("### Response:\n")[1] if not full else response

    return response

print(qa("How do I apply?", full=True))
```