ThinkTim21 committed 3abf4ad (verified) · 1 parent: 85af420

Update README.md

Files changed (1)
  1. README.md +35 -35
README.md CHANGED
 
5
 
6
  ![image](FInPlan-1_Image.png)
7
 
8
+ Created with ChatGPT (GPT-4o) using a link to this model repository and a brief prompt.
9
 
10
 
11
  # FinPlan-1
12
 
13
  FinPlan-1 is an LLM trained to assist with the creation of basic personal financial plans for individuals. This model is built on the
14
+ Fino1 model, itself a version of Llama-3.1-8B-Instruct that was CoT fine-tuned to improve its financial reasoning ability.
15
 
16
 
17
 
 
28
  on financial reasoning tasks to assist individuals with two key aspects of financial planning.
29
 
30
  1. Assist with the creation of a budget spreadsheet to enable individuals to keep track of their finances and understand where their money is going.
31
+ 2. Aid with planning for short-, medium-, and long-term goals, including breaking those goals down into monthly savings targets and suggesting broad investment vehicles to fit each goal's timeframe.
32
 
33
+ While current LLMs can perform these tasks to an extent, they are often inconsistent in their response structure, can sometimes struggle with breaking down basic mathematics,
34
+ and frequently go beyond the basic tasks at hand, recommending inappropriate savings and investment vehicles for individual savings goals. The Fino-1 8B model is certainly well
35
+ trained for corporate financial reasoning tasks, but its recommendations for savings and investment vehicles were often too aggressive for short-term goals and may recommend
36
+ long-term savings vehicles that carry tax penalties if not used appropriately. This model uses LoRA on a procedurally generated budgeting dataset as well as few-shot prompting using
37
+ a separate dataset based around short-, medium-, and long-term goals to enhance the ability of Fino-1 8B to accomplish these tasks.
38
 
39
  The results of this training and prompting method are encouraging, as the model consistently produces budget spreadsheets (through the generation of executable Python code)
40
  as well as somewhat reliable savings plan assistance with the use of few-shot prompting. These training methods do have an impact on this model's performance on standard
41
+ benchmarks like GSM8K and MMLU, resulting in drops in performance on both compared with the base model; however, this loss in generalization is made up for by the model's
42
+ improved ability to accomplish the tasks of assisting individuals with budgeting and fixed-term savings goals.
43
 
44
 
45
  - **Developed by:** Timothy Austin Rodriguez
 
52
  ### Training Data
53
 
54
  This model is trained on a procedurally generated synthetic dataset that provides structured prompts and responses to assist the underlying Fino-1 8B model
55
+ with creating executable Python code that builds a budget spreadsheet and exports it to Microsoft Excel (.xlsx) format. This dataset (attached to this repository) is composed
56
  of 3000 examples which were divided into a train/validation split of 2500 for training and 500 for validation. The code used to create and randomize this dataset, including
57
  the seeds (42 for randomization, 60 for creation), can be located in the ipynb files attached to this repository. This dataset is called budget_dataset.csv.
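
The generation code itself is in the attached ipynb files; as a rough, illustrative sketch of the procedural approach described above (seeded random draws for incomes and expense categories, then a 2500/500 split), the pattern might look like the following. The value ranges and column names here are placeholder assumptions, not the actual generation parameters.

```python
import random
import pandas as pd

random.seed(42)  # randomization seed noted above; the ranges below are illustrative placeholders

rows = []
for _ in range(3000):
    income = random.randint(30000, 120000)   # annual income
    rent = random.randint(800, 3000)         # monthly rent and utilities
    food = random.randint(200, 800)          # monthly food spending
    other = random.randint(50, 400)          # other monthly expenses
    prompt = (f"Q: I have an income of about {income} a year and my monthly expenses include "
              f"{rent} a month in rent and utilities, ${food} in food, and about {other} a month "
              f"in other expenses. Using python, can you create for me a budget spreadsheet "
              f"and export it to excel?")
    rows.append({"prompt": prompt, "response": "<structured Python-code answer>"})

df = pd.DataFrame(rows)
train_df = df.sample(n=2500, random_state=42)  # 2500 training examples
val_df = df.drop(train_df.index)               # 500 held out for validation
df.to_csv("budget_dataset.csv", index=False)
```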
58
 
59
+ While not used to train this model, a secondary dataset for improving the model's performance on short-, medium-, and long-term goal planning was developed
60
  via procedural generation. This dataset was generated much like the first, through random procedural generation of 3000 examples of prompts and responses. Random seeds and
61
  train/validation split code can be located in the same ipynb file as the budget dataset. This dataset is called goals_dataset.csv. It was not used to train the final
62
+ model due to poor performance encountered when leveraging LoRA for additional training. The model actually performed worse when prompted with an example from the validation dataset
63
+ after training than before training. A deeper exploration of why this occurred is warranted, and other training/tuning methods beyond LoRA should be considered for future enhancement
64
  of this model.
65
 
66
  ## Training Method
67
 
68
  The method of training/tuning for this model is the Parameter-Efficient Fine-Tuning method called Low-Rank Adaptation (LoRA). LoRA is a fine-tuning approach that is well
69
  suited to tuning a model for domain-specific tasks such as creating personal financial plans. LoRA is significantly more efficient than full fine-tuning, requiring fewer compute
70
+ resources, and is much more memory-efficient because fewer model weights are changed. In many cases LoRA yields results very similar to full fine-tuning without the
71
+ heavy computational expense inherent in full fine-tuning. This method was chosen given the time allocated for training this model, the limited compute resources due to competing
72
+ requests for GPU time on the University of Virginia's Rivanna High Performance Computing cluster, and the desire to achieve results similar to full fine-tuning despite those
73
+ limited compute resources. LoRA tuning hyperparameter values were selected through experimentation and can be found in one of the ipynb files attached to this repository and in
74
  the summary below.
75
 
76
  Hyperparameters
 
83
  - epochs = 5
84
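
The full hyperparameter list is recorded in the attached ipynb files; as a minimal sketch of the LoRA setup described above (PEFT-style adapters over Fino-1 8B, trained for 5 epochs), the configuration could look like the following. The repository id, rank, alpha, dropout, target modules, batch size, and learning rate shown here are illustrative assumptions, not the values actually used.

```python
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("TheFinAI/Fino1-8B")  # assumed repo id for Fino-1 8B

lora_cfg = LoraConfig(
    r=16,                                 # placeholder rank
    lora_alpha=32,                        # placeholder scaling factor
    lora_dropout=0.05,                    # placeholder dropout
    target_modules=["q_proj", "v_proj"],  # placeholder attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()        # only the low-rank adapter weights are trainable

training_args = TrainingArguments(
    output_dir="finplan1-lora",
    num_train_epochs=5,                   # epochs = 5, per the settings above
    per_device_train_batch_size=4,        # placeholder
    learning_rate=2e-4,                   # placeholder
)
```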
 
85
  Secondarily, this model makes use of few-shot prompting due to the aforementioned poor performance of LoRA when training on the goals dataset. It was found that few-shot
86
+ prompting improves the ability of the model to provide the desired response structure without degrading the model's performance, as was noted with the LoRA implementation regardless
87
+ of the hyperparameters that were selected. Example code showing how to implement the appropriate few-shot prompting is available in one of the provided ipynb files in this repository; a minimal sketch of the idea is shown below.
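
Assuming goals_dataset.csv has prompt and response columns (an assumption; the actual schema is in the attached files), a few-shot prompt for the goals task can be assembled by prepending a handful of dataset examples to the new question:

```python
import pandas as pd

goals = pd.read_csv("goals_dataset.csv")  # assumes 'prompt' and 'response' columns

def build_few_shot_prompt(question: str, k: int = 3) -> str:
    """Prepend k example prompt/response pairs from the goals dataset to a new question."""
    shots = goals.sample(n=k, random_state=42)
    blocks = [f"{row.prompt}\nA: {row.response}" for row in shots.itertuples()]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)
```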
88
 
89
 
90
  ## Evaluation
 
98
 
99
 
100
  The chosen benchmarks, GSM8K, MMLU, and the two synthetic dataset examples, were selected to provide a view of the model's performance both in terms of its generalization
101
+ ability and its ability to perform the tasks it is trained to accomplish. As the underlying model that FinPlan-1 is based on, Fino-1 8B is a natural comparison model
102
  to evaluate for benchmarking. Further, the Llama 3.2-3B Instruct model is a newer, albeit smaller (parameter-wise), version of the model that underlies Fino-1 8B. Given
103
+ this model's rather decent performance on the financial planning tasks, it serves as a good comparison for FinPlan-1. Finally, the Ministral 8B Instruct 2410 model is of comparable
104
+ size (parameter-wise) to FinPlan-1 and was originally considered as a potential base model for FinPlan-1, thus making it a good model for comparison. Since the tasks this model is tuned to accomplish are non-standard and domain-specific, the
105
  benchmark for these tasks comes from the validation/hold-out split of the training dataset, and its evaluation is somewhat subjective. For each of these models, the Budget and Goals examples were
106
  presented to the model in either a zero-shot prompt (budget) or a three-shot prompt (goals). Only the trained FinPlan-1 model was able to provide the desired format for the Excel file
107
  for the budget task, while both Fino-1 8B and FinPlan-1 performed well on the goals dataset. For measurement of generalizability and retention of reasoning skill, all four models
108
+ were benchmarked on GSM8K (grade-school mathematics reasoning) as well as MMLU (general reasoning). While the domain-specific LoRA tuning certainly led to a degradation in FinPlan-1's
109
  benchmark scores with respect to its underlying model, Fino-1 8B, the drop in performance is rather small for MMLU, and GSM8K performance remains above Llama 3.2-3B Instruct.
110
 
111
+ ## Intended Usage
112
 
113
  As described above, this model is intended to assist with the creation of simple financial plans for individuals, specifically the creation of a budget
114
+ spreadsheet for tracking expenses as well as planning for short-, medium-, and long-term savings goals. While this model can be prompted on a wide range of other tasks, it is
115
+ not recommended to use this model for those purposes, as it has been specifically fine-tuned for these two tasks and performance on tasks outside that scope could be diminished.
116
 
117
  See below for the basic code required to load the model from Hugging Face using torch. Note that the tokenizer is pulled from the Fino-1 8B repository, as it was not changed
118
  from the base Fino-1 8B model.
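
The actual loading code appears later in the README (not shown in this diff); a minimal sketch of the pattern described, with the repository ids as assumptions to be replaced with the real ones, might look like this:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ThinkTim21/FinPlan-1"   # assumed id of this model repository
tokenizer_id = "TheFinAI/Fino1-8B"  # tokenizer unchanged from the base Fino-1 8B model

tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "Q: Using python, can you create for me a budget spreadsheet and export it to excel?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
```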
 
180
 
181
  The prompt format varies between the budget task and the goals task.
182
 
183
+ For the budget task, the following prompt method is recommended.
184
 
185
```text
186
  Q: I have an income of about 53255 a year and my monthly expenses include 2208 a month in rent and utilities, a 700 car payment, $300 in food, and about 205 a month in other expenses. Using python, can you create for me a budget spreadsheet and export it to excel?
187
  ```
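
For reference, a successful response to a prompt like this is executable Python roughly along the following lines; the exact column layout and file name the model produces may differ, so treat this as an illustrative sketch rather than the model's verbatim output.

```python
import pandas as pd

monthly_income = 53255 / 12
expenses = {
    "Rent and utilities": 2208,
    "Car payment": 700,
    "Food": 300,
    "Other": 205,
}
total_expenses = sum(expenses.values())

budget = pd.DataFrame({
    "Category": ["Monthly income", *expenses, "Total expenses", "Remaining"],
    "Amount": [round(monthly_income, 2), *expenses.values(),
               total_expenses, round(monthly_income - total_expenses, 2)],
})
budget.to_excel("budget.xlsx", index=False)  # requires openpyxl for .xlsx export
```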
188
 
189
+ For the goals task, I recommend using few-shot prompting: use the goals_dataset.csv file as your base and then add your preferred prompt, in the following format,
190
  to the few-shot examples derived from the goals dataset.
191
 
192
```text
 
194
  ```
195
 
196
 
197
+ I recommend the following code to set up few-shot prompting for the goals task:
198
 
199
```python
200
 
 
312
 
313
  There are several risks and limitations of this model that are worth mentioning. First, in a handful of cases this model produced responses in which the math inherent in the
314
  savings goals responses was not correct, sometimes failing to add numbers up correctly or having slight rounding errors when dividing long-term goals into monthly targets.
315
+ While it is well known that LLMs can struggle with mathematics, given that their knowledge is language-based rather than numerical, this can be a problem for a finance-focused LLM.
316
+ I strongly recommend double-checking the figures presented by this model. While this issue is sidestepped in the budget task through the use of Python code to prevent math errors,
317
  that safeguard is not implemented for the goals task. Further, this model should be limited in its use for out-of-scope tasks, as the generalization benchmarks demonstrated that,
318
  compared to its base model, this model exhibits decreased reasoning ability outside its domain-specific tasks.
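
One simple safeguard when using the goals output is to re-derive the arithmetic independently before acting on it; a short check along these lines (the goal amounts and timeframes are illustrative) is enough to catch most rounding or addition slips.

```python
# Recompute monthly savings targets rather than trusting the model's figures.
goals = {
    "Vacation (short term)": (3357, 12),             # (target amount, months) -- illustrative values
    "House down payment (medium term)": (40000, 60),
    "Retirement top-up (long term)": (150000, 300),
}

for name, (amount, months) in goals.items():
    print(f"{name}: save about ${amount / months:,.2f} per month for {months} months")
```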
319
 
320
+ To improve this model, I would recommend that future model trainers and tuners focus on adjusting it to default to producing Python code for all mathematics-based
321
+ prompts. Sticking with Python for mathematics processing should allow the model to perform better on the goals task while retaining performance on the budget task.
322
 
323
  [More Information Needed]
324
 
 
355
 
356
  ## Model Card Contact
357
 
358
+ tar3kh@virginia.edu