---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
base_model: unsloth/llama-3-8b-bnb-4bit
datasets:
- gbharti/finance-alpaca
---

A `unsloth/llama-3-8b-bnb-4bit` model fine-tuned on the [gbharti/finance-alpaca](https://huggingface.co/datasets/gbharti/finance-alpaca) dataset using [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

# Model Usage

Use the **unsloth** library to download and run the model. The configuration values and the Alpaca-style prompt template below are typical Unsloth defaults; adjust them to your setup.

```python
from unsloth import FastLanguageModel

max_seq_length = 2048  # example value; set to the context length you need
dtype = None           # None lets Unsloth auto-detect (float16/bfloat16)
load_in_4bit = True    # keep the weights in 4-bit to save memory

# Alpaca-style prompt template (matches the format shown in the sample output below)
prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "dmedhi/llama-3-personal-finance-8b-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
FastLanguageModel.for_inference(model)

inputs = tokenizer(
    [
        prompt.format(
            "Which is better, Mutual fund or Fixed deposit?",  # instruction
            "",  # input
            "",  # output - leave empty for generation
        )
    ],
    return_tensors = "pt",
).to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)  # play around with the number of tokens for better results
result = tokenizer.batch_decode(outputs)
print(f"Response:\n{result[0]}")

"""
Response:
<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
If I buy a stock and hold will I get rich?

### Input:

### Response:
I'm not sure what you mean by "get rich". If you buy a stock and hold it for a long time, you will probably make money. If you buy a stock and hold it for a short time, you might make money, but you might also lose money. It all depends on how
"""
```

This model can also be loaded with `AutoPeftModelForCausalLM` from the **peft** library, but that path is much slower and not recommended.

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained(
    "dmedhi/llama-3-personal-finance-8b-bnb-4bit",
    load_in_4bit = True,  # load the adapter and base weights in 4-bit
)
tokenizer = AutoTokenizer.from_pretrained("dmedhi/llama-3-personal-finance-8b-bnb-4bit")
```

**Note**: For the complete code and examples, including dataset preparation, training code, and model inference, please refer to this [notebook](https://github.com/d1pankarmedhi/fine-tuning-llm/blob/main/llama3-personal-finance-FT.ipynb).
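
If you do load the adapter through PEFT, generation goes through the standard `transformers` `generate` API. The snippet below is a minimal, illustrative sketch (it is not taken from the notebook) that reuses the `model` and `tokenizer` from the PEFT block and the same Alpaca-style prompt format shown above:

```python
# Illustrative only: generate with the model/tokenizer loaded via
# AutoPeftModelForCausalLM in the previous block.
import torch

alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

text = alpaca_prompt.format("Which is better, Mutual fund or Fixed deposit?", "", "")
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=64, use_cache=True)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Expect this path to be noticeably slower than the Unsloth example above.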