dmedhi
/

llama-3-personal-finance-8b-bnb-4bit

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

llama-3-personal-finance-8b-bnb-4bit / README.md

dmedhi's picture

Update README.md

ca7736b verified 5 months ago

|

history blame contribute delete

2.45 kB

	---
	language:
	- en
	license: apache-2.0
	tags:
	- text-generation-inference
	- transformers
	- unsloth
	- llama
	- trl
	base_model: unsloth/llama-3-8b-bnb-4bit
	datasets: gbharti/finance-alpaca
	---
	A fine-tuned `unsloth/llama-3-8b-bnb-4bit` model on [gbharti/finance-alpaca](https://huggingface.co/datasets/gbharti/finance-alpaca) dataset using [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

	# Model Usage

	Use the unsloth library to download and use the model.

	```python
	from unsloth import FastLanguageModel
	model, tokenizer = FastLanguageModel.from_pretrained(
	model_name = "dmedhi/llama-3-personal-finance-8b-bnb-4bit",
	max_seq_length = max_seq_length,
	dtype = dtype,
	load_in_4bit = load_in_4bit,
	)
	FastLanguageModel.for_inference(model)
	inputs = tokenizer(
	[
	prompt.format(
	"Which is better, Mutual fund or Fixed deposit?", # instruction
	"", # input
	"", # output
	)
	], return_tensors = "pt").to("cuda")

	outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True) # play around with number of tokens for better results
	result = tokenizer.batch_decode(outputs)
	print(f"Response:\n{result[0]}")

	"""
	Response:
	<\|begin_of_text\|>Below is an instruction that describes a task, paired with an input that provides further context.
	Write a response that appropriately completes the request.

	### Instruction:
	If I buy a stock and hold will I get rich?

	### Input:

	### Response:
	I'm not sure what you mean by "get rich". If you buy a stock and hold it for a long time, you will probably make money.
	If you buy a stock and hold it for a short time, you might make money, but you might also lose money. It all depends on how
	"""
	```

	This model can also be used using the `AutoModelForPeftCausalLM` from peft library but it is very slow and not recommended.

	```python
	from peft import AutoPeftModelForCausalLM
	from transformers import AutoTokenizer
	model = AutoPeftModelForCausalLM.from_pretrained(
	"dmedhi/llama-3-personal-finance-8b-bnb-4bit",
	load_in_4bit = load_in_4bit,
	)
	tokenizer = AutoTokenizer.from_pretrained("dmedhi/llama-3-personal-finance-8b-bnb-4bit")
	```

	Note: For complete code and example, please refer to this [notebook](https://github.com/d1pankarmedhi/fine-tuning-llm/blob/main/llama3-personal-finance-FT.ipynb) which includes
	dataset preparation, training code and model inference example.