Update README.md
![FinPlan-1 logo](finplan1.png)

Created with ChatGPT-4o using a link to this model repository and a brief prompt.

# FinPlan-1

FinPlan-1 is an LLM trained to assist with the creation of basic personal financial plans for individuals. This model is built on top of the
Fino-1 model, which is itself a version of Llama-3.1-8B-Instruct that was CoT fine-tuned to improve its financial reasoning ability.

FinPlan-1 adapts a model already trained on financial reasoning tasks to assist individuals with two key aspects of financial planning.

1. Assist with the creation of a budget spreadsheet to enable individuals to keep track of their finances and understand where their money is going.
2. Aid with planning for short, medium and long term goals, including breaking those goals down into monthly savings targets and suggesting broad investment vehicles to fit each goal's timeframe.

While current LLMs can perform these tasks to an extent, they are often inconsistent in their response structure, can sometimes struggle with basic mathematics,
and frequently go beyond the basic tasks at hand, recommending inappropriate savings and investment vehicles for individual savings goals. The Fino-1 8B model is certainly well
trained for corporate financial reasoning tasks, but its recommendations for savings and investment vehicles were often too aggressive for short term goals, and it may recommend
long term savings vehicles which carry tax penalties if not used appropriately. This model uses LoRA on a procedurally generated budgeting dataset, as well as few shot prompting using
a separate dataset based around short, medium and long term goals, to enhance the ability of Fino-1 8B to accomplish these tasks.

The results of this training and prompting method are encouraging: the model consistently produces budget spreadsheets (through the generation of executable python code)
as well as somewhat reliable savings plan assistance with the use of few shot prompting. These training methods do have an impact on this model's performance on standard
benchmarks like GSM8K and MMLU, resulting in drops in performance on both compared with the base model; however, this loss in generalization is made up for by the model's
improved ability to accomplish the tasks of assisting individuals with budgeting and fixed term savings goals.

- **Developed by:** Timothy Austin Rodriguez

### Training Data

This model is trained on a procedurally generated synthetic dataset that provides structured prompts and responses to assist the underlying Fino-1 8B model
with creating executable python code that builds a budget spreadsheet and exports it to Microsoft Excel (.xlsx) format. This dataset (attached to this repository) consists
of 3000 examples, which were divided into a train/validation split of 2500 for training and 500 for validation. The code used to create and randomize this dataset, including
the seeds (42 for randomization, 60 for creation), can be located in the ipynb files attached to this repository. This dataset is called budget_dataset.csv.
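
The exact generation code and prompt/response templates are in the attached ipynb files; the sketch below is only meant to illustrate the general shape of the procedure, with made-up templates and column names.

```python
import random

import pandas as pd

random.seed(60)  # creation seed noted above; 42 is used for the shuffle/split

PROMPT = ("Q: I have an income of about {income} a year and my monthly expenses include "
          "{rent} a month in rent and utilities, a {car} car payment, ${food} in food, and "
          "about {other} a month in other expenses. Using python, can you create for me a "
          "budget spreadsheet and export it to excel?")

rows = []
for _ in range(3000):
    values = {
        "income": random.randint(30000, 150000),
        "rent": random.randint(800, 3500),
        "car": random.randint(0, 900),
        "food": random.randint(200, 900),
        "other": random.randint(50, 600),
    }
    rows.append({
        "prompt": PROMPT.format(**values),
        "response": "# python script that builds the budget DataFrame and exports it to .xlsx (abridged)",
    })

# Shuffle with the randomization seed, then take a 2500/500 train/validation split.
df = pd.DataFrame(rows).sample(frac=1, random_state=42).reset_index(drop=True)
train, val = df.iloc[:2500], df.iloc[2500:]
```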

While not used for training this model, a secondary dataset for the purposes of improving the model's performance on short, medium and long term goal planning was developed
via procedural generation. This dataset was generated much like the first, through random procedural generation of 3000 examples of prompts and responses. The random seeds and
train/validation split code can be located in the same ipynb file as the budget dataset. This dataset is called goals_dataset.csv. It was not used to train the final
model due to poor performance encountered when leveraging LoRA for additional training: the model actually performed worse when prompted with an example from the validation dataset
after training than before training. A deeper exploration of why this occurred is warranted, and other training/tuning methods beyond LoRA should be considered for future enhancement
of this model.

## Training Method

The method of training/tuning for this model is the Parameter-Efficient Fine-Tuning (PEFT) method called Low-Rank Adaptation, or LoRA. LoRA is a fine tuning approach that is well
suited to tuning a model for domain specific tasks such as creating personal financial plans. LoRA is significantly more efficient than full fine tuning, requiring fewer compute
resources, and is much more memory efficient because only a small set of adapter weights is updated while the base model weights stay frozen. In many cases a LoRA implementation yields results very similar to full fine tuning without the
heavy computational expense inherent in that approach. This method was chosen given the time allocated for training this model, the limited compute resources due to competing
requests for GPU time on the University of Virginia's Rivanna High Performance Computing cluster, and the desire to achieve results similar to full fine tuning without the
compute resources it requires. The LoRA tuning hyperparameter values were selected through experimentation and can be found in one of the ipynb files attached to this repository and in
the summary below.

Hyperparameters

- epochs = 5
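
For reference, a LoRA setup of this kind with the Hugging Face peft library looks roughly like the sketch below. The rank, alpha, dropout and target modules shown are illustrative stand-ins rather than the recorded values, and the Fino-1 8B repo id is assumed.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Assumed repo id for the Fino-1 8B base model.
base = AutoModelForCausalLM.from_pretrained("TheFinAI/Fino1-8B")

lora_config = LoraConfig(
    r=16,                                  # illustrative rank
    lora_alpha=32,                         # illustrative scaling factor
    lora_dropout=0.05,                     # illustrative dropout
    target_modules=["q_proj", "v_proj"],   # attention projections, a common choice
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # confirms only a small fraction of weights are trainable
```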

Secondarily, this model makes use of Few Shot Prompting due to the aforementioned poor performance of LoRA when training on the goals dataset. It was found that few shot
prompting improves the ability of the model to provide the desired response structure without degrading the model's performance, as was noted with the LoRA implementation regardless
of the hyperparameters that were selected. Example code for how to implement the appropriate few shot prompting is available in one of the provided ipynb files in this repository.

## Evaluation

The benchmarks chosen, GSM8K, MMLU and the two synthetic dataset examples, were selected to provide a view of the performance of the model both in terms of its generalization
ability as well as its ability to perform the tasks it is trained to accomplish. As the underlying model that FinPlan-1 is based on, Fino-1 8B is a natural comparison model
to evaluate for benchmarking. Further, the Llama 3.2-3B Instruct model is a newer, albeit smaller, version of the model family that underlies Fino-1 8B. Given
that model's rather decent performance on the financial planning tasks, it serves as a good comparison for FinPlan-1. Finally, the Ministral 8B Instruct-2410 model is of comparable
parameter size to FinPlan-1 and was originally considered as a potential base model for FinPlan-1, making it a good model for comparison. Since the tasks this model is tuned to accomplish are non standard and domain specific, the
benchmark for these tasks comes from the validation/hold-out split of the training dataset and its evaluation is somewhat subjective. For each of these models, the Budget and Goals examples were
presented to the model in either a zero shot prompt (budget) or a three shot prompt (goals). Only the trained FinPlan-1 model was able to provide the desired format for the excel file
for the budget task, while both Fino-1 8B and FinPlan-1 performed well on the goals dataset. For measurement of generalizability and retention of reasoning skill, all four models
were benchmarked on GSM8K (grade school mathematics reasoning) as well as MMLU (general reasoning). While the domain specific LoRA tuning certainly led to a degradation in FinPlan-1's
benchmark scores with respect to its underlying model Fino-1 8B, the drop in performance is rather small for MMLU, and GSM8K performance remains above Llama 3.2-3B Instruct.
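
These scores can be reproduced in a number of ways; one common route (not necessarily the exact harness used here) is EleutherAI's lm-evaluation-harness, sketched below with a placeholder repo id.

```python
import lm_eval

# "tar3kh/FinPlan-1" is a placeholder; point this at the model under test.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=tar3kh/FinPlan-1,dtype=bfloat16",
    tasks=["gsm8k", "mmlu"],
    batch_size=8,
)
print(results["results"])
```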

## Intended Usage

As described above, this model is intended to be used to assist with the creation of simple financial plans for individuals, specifically for assistance with the creation of a budget
spreadsheet for tracking expenses as well as planning for short, medium and long term savings goals. While this model can be prompted on a wide range of other tasks, it is
not recommended to use this model for those purposes, as it has been specifically fine-tuned for these two tasks and performance on tasks outside that scope could be diminished.

See below for the basic code required in order to import the model from Hugging Face using torch. Note the tokenizer is pulled from the Fino-1 8B repository, as it was not changed
from the base Fino-1 8B model.
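
A minimal loading sketch follows; the repo ids are placeholders for this repository and the Fino-1 8B repository, so substitute the actual ids.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

FINPLAN_REPO = "tar3kh/FinPlan-1"   # placeholder id for this repository
FINO1_REPO = "TheFinAI/Fino1-8B"    # assumed id for the Fino-1 8B base repository

# Tokenizer comes from the base Fino-1 8B repository; it was not changed during tuning.
tokenizer = AutoTokenizer.from_pretrained(FINO1_REPO)
model = AutoModelForCausalLM.from_pretrained(
    FINPLAN_REPO,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Q: Using python, can you create for me a budget spreadsheet and export it to excel?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
```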

The prompt format varies between the budget task and the goals task.

For the budget task, the following prompt method is recommended.

```{python}
Q: I have an income of about 53255 a year and my monthly expenses include 2208 a month in rent and utilities, a 700 car payment, $300 in food, and about 205 a month in other expenses. Using python, can you create for me a budget spreadsheet and export it to excel?
```
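
The model answers this style of prompt with runnable python. A sketch of the kind of script it is expected to produce (the figures mirror the prompt above; the file name is illustrative):

```python
import pandas as pd

monthly_income = 53255 / 12
expenses = {
    "Rent and Utilities": 2208,
    "Car Payment": 700,
    "Food": 300,
    "Other": 205,
}

budget = pd.DataFrame({
    "Category": list(expenses) + ["Total Expenses", "Monthly Income", "Remaining"],
    "Amount": list(expenses.values()) + [
        sum(expenses.values()),
        round(monthly_income, 2),
        round(monthly_income - sum(expenses.values()), 2),
    ],
})

# Writing .xlsx requires an Excel writer such as openpyxl.
budget.to_excel("budget.xlsx", index=False)
```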

For the goals task, I recommend using Few Shot Prompting, making use of the goals_dataset.csv file as your base and then adding your preferred prompt in the following format
to the few shot examples derived from the goals dataset.

```{python}
Q: My short term goal is to save for a $3357 vacation in the next year, my mediu
```

I recommend the following approach to set up few shot prompting for the goals task (the full example code is in the attached ipynb files).
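
A minimal sketch of that setup, assuming goals_dataset.csv has `prompt` and `response` columns (check the file for the actual headers):

```python
import pandas as pd

goals = pd.read_csv("goals_dataset.csv")

# Three-shot prompt: prepend three worked examples from the goals dataset, then the new question.
shots = goals.sample(n=3, random_state=42)
few_shot_block = "\n\n".join(
    f"Q: {row['prompt']}\nA: {row['response']}" for _, row in shots.iterrows()
)

# Replace the placeholder below with your own short, medium and long term goals.
user_question = "Q: My short term goal is ..., my medium term goal is ..., and my long term goal is ..."

full_prompt = f"{few_shot_block}\n\n{user_question}\nA:"
```

The resulting `full_prompt` string is then passed to the tokenizer and `model.generate` exactly as in the loading sketch above.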

There are several risks and limitations of this model that are worth mentioning. First, in a handful of cases this model produced responses in which the math inherent in the
savings goals responses was not correct, sometimes failing to add numbers up correctly or having slight rounding errors when dividing long term goals into monthly targets.
It is well known that LLMs can struggle with mathematics, given that their knowledge is language based rather than numerically based, and this is a particular problem for a finance focused LLM.
I strongly recommend double checking the figures presented by this model. While this issue is sidestepped in the budget task through the use of python code to prevent math errors,
that safeguard is not implemented for the goals task. Further, this model should be limited in its use for out of scope tasks, as the generalization benchmarks demonstrated that,
compared to its base model, this model exhibits decreased reasoning ability outside its domain specific task.
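
A quick, model-independent way to double check a monthly savings target in a goals response:

```python
# Recompute the monthly target for a goal and compare it with the model's figure.
goal_amount = 3357        # e.g. the short term vacation goal above
months = 12               # timeframe of the goal
expected_monthly = round(goal_amount / months, 2)

model_monthly = 279.75    # figure quoted in the model's response
assert abs(expected_monthly - model_monthly) < 0.01, "model's monthly target is off"
```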

In order to improve this model, I would recommend that future model trainers and tuners focus on adjusting this model to default to producing python code for all mathematics based
prompts. Sticking with python for mathematics processing should allow the model to perform more highly on the goals task while retaining performance on the budget task.

[More Information Needed]

## Model Card Contact

tar3kh@virginia.edu