dmedhi committed on
Commit
4bea6f6
1 Parent(s): 09031b3

Update README.md

  1. README.md +59 -2
README.md CHANGED
@@ -9,6 +9,7 @@ tags:
  - llama
  - trl
  base_model: unsloth/llama-3-8b-bnb-4bit
  ---

  # Uploaded model
@@ -17,6 +18,62 @@ base_model: unsloth/llama-3-8b-bnb-4bit
  - **License:** apache-2.0
  - **Finetuned from model :** unsloth/llama-3-8b-bnb-4bit

- This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
  - llama
  - trl
  base_model: unsloth/llama-3-8b-bnb-4bit
+ datasets: gbharti/finance-alpaca
  ---

  # Uploaded model
 
  - **License:** apache-2.0
  - **Finetuned from model :** unsloth/llama-3-8b-bnb-4bit

+ This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
+ A `unsloth/llama-3-8b-bnb-4bit` model fine-tuned on the [gbharti/finance-alpaca](https://huggingface.co/datasets/gbharti/finance-alpaca) dataset.

+ # Model Usage
+
+ Use the **unsloth** library to download and run the model.
+
+ ```python
+ from unsloth import FastLanguageModel
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name = "dmedhi/llama-3-personal-finance-8b-bnb-4bit",
+     max_seq_length = max_seq_length, # define before this call
+     dtype = dtype, # define before this call
+     load_in_4bit = load_in_4bit, # define before this call
+ )
+ FastLanguageModel.for_inference(model)
+ inputs = tokenizer(
+ [
+     prompt.format( # prompt: the instruction/input/response template visible in the sample output
+         "If I buy a stock and hold will I get rich?", # instruction
+         "", # input
+         "", # output
+     )
+ ], return_tensors = "pt").to("cuda")
+
+ outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True) # raise max_new_tokens for longer responses
+ result = tokenizer.batch_decode(outputs)
+ print(f"Response:\n{result[0]}")
+
+ """
+ Response:
+ <|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context.
+ Write a response that appropriately completes the request.
+
+ ### Instruction:
+ If I buy a stock and hold will I get rich?
+
+ ### Input:
+
+ ### Response:
+ I'm not sure what you mean by "get rich". If you buy a stock and hold it for a long time, you will probably make money.
+ If you buy a stock and hold it for a short time, you might make money, but you might also lose money. It all depends on how
+ """
+ ```
+
+ This model can also be loaded with `AutoPeftModelForCausalLM` from the **peft** library, but this is very slow and not recommended.
+
+ ```python
+ from peft import AutoPeftModelForCausalLM
+ from transformers import AutoTokenizer
+ model = AutoPeftModelForCausalLM.from_pretrained(
+     "dmedhi/llama-3-personal-finance-8b-bnb-4bit",
+     load_in_4bit = load_in_4bit, # define before this call
+ )
+ tokenizer = AutoTokenizer.from_pretrained("lora_model") # "lora_model" looks like a local path left over from training; replace it with your tokenizer location
+ ```
+
+ For the complete code and examples, refer to this [notebook](https://github.com/d1pankarmedhi/fine-tuning-llm/blob/main/llama3-personal-finance-FT.ipynb), which includes
+ dataset preparation, training code, and a model inference example.
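
The usage snippets added in this commit rely on several names that the README never defines: `max_seq_length`, `dtype`, `load_in_4bit`, and `prompt`. The sketch below shows one plausible way to fill them in. The three setting values are assumptions, not values confirmed by the commit. The template is reconstructed from the sample response in the README, so its exact whitespace is also an assumption. `build_prompt` and `extract_response` are hypothetical helpers introduced here for illustration only.

```python
# Assumed settings: the README uses these names without defining them.
max_seq_length = 2048  # assumption: context length passed to the loader
dtype = None           # assumption: None leaves dtype selection to the loader
load_in_4bit = True    # assumption: consistent with the 4-bit base model

# Template reconstructed from the sample response in the README.
# The exact whitespace is an assumption.
prompt = """Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

RESPONSE_MARKER = "### Response:"


def build_prompt(instruction: str, context: str = "") -> str:
    """Fill the template, leaving the response slot empty for generation."""
    return prompt.format(instruction, context, "")


def extract_response(decoded: str) -> str:
    """Return the text after the response marker, or "" if it is absent."""
    _, _, tail = decoded.partition(RESPONSE_MARKER)
    return tail.strip()


text = build_prompt("If I buy a stock and hold will I get rich?")
print(text)
print(extract_response(text + "You will probably make money."))
```

With these definitions in scope, `build_prompt(...)` can stand in for the `prompt.format(...)` call in the README snippet, and `extract_response(result[0])` trims the echoed prompt from the decoded output. Any special tokens the tokenizer emits remain in the returned text.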