taddeusb90 commited on
Commit
c38b349
1 Parent(s): a59a617

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +129 -3
README.md CHANGED
@@ -1,3 +1,129 @@
1
- ---
2
- license: llama3
3
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: llama3
3
+ datasets:
4
+ - taddeusb90/finbro-v0.1.0
5
+ language:
6
+ - en
7
+ library_name: transformers
8
+ tags:
9
+ - finance
10
+ ---
11
+
12
+ Fibro v0.1.0 Dolphin 2.9 Llama 3 8B Model with 1m token context window
13
+ ======================
14
+
15
+ Model Description
16
+ -----------------
17
+
18
+ The Fibro Dolphin 2.9 Llama 3 8B model is a language model optimized for financial applications. This model is uncensored and aims to enhance financial analysis, automate data extraction, improve financial literacy across various user expertise levels, and is trained for obedience. It utilizes a massive 1m token context window.
19
+ This is just a sneak peek into what's coming, and future releases will be done periodically, consistently improving its performance.
20
+
21
+ ![FinBro](https://huggingface.co/taddeusb90/finbro-v0.1.0-dolphin-2.9-llama-3-8B-instruct-131k/resolve/main/1539868156729340231_3171889935_10-05-2024-05-08-05.jpeg)
22
+
23
+ Training:
24
+ -----------------
25
+
26
+ The model is still training, I will be sharing new incremental releases while it's improving so you have time to play around with it.
27
+ ![Loss](https://huggingface.co/taddeusb90/finbro-v0.1.0-llama-3-8B-instruct-1m-POSE/resolve/main/W%26B%20Chart%2006_05_2024%2C%2015_57_42.png)
28
+ ![Evaluation Loss](https://huggingface.co/taddeusb90/finbro-v0.1.0-llama-3-8B-instruct-1m-POSE/resolve/main/W%26B%20Chart%2006_05_2024%2C%2015_58_01.png)
29
+
30
+ What's Next?
31
+ -----------
32
+
33
+ * **Extended Capability:** Continue training on the 8B model as it hasn't converged yet I only scratched the surface here and transitioning to scale up with a 70B model for deeper insights and broader financial applications.
34
+ * **Dataset Expansion:** Continuous enhancement by integrating more diverse and comprehensive real and synthetic financial data.
35
+ * **Advanced Financial Analysis:** Future versions will support complex financial decision-making processes by interpreting and analyzing financial data within agentive workflows.
36
+ * **Incremental Improvements:** Regular updates are made to increase the model's efficiency and accuracy and extend its capabilities in financial tasks.
37
+
38
+ Model Applications
39
+ ------------------
40
+
41
+ * **Information Extraction:** Automates the process of extracting valuable data from unstructured financial documents.
42
+ * **Financial Literacy:** Provides explanations of financial documents at various levels, making financial knowledge more accessible.
43
+
44
+
45
+ How to Use
46
+ ----------
47
+
48
+ Here is how to load and use the model in your Python projects:
49
+
50
+
51
+ ```python
52
+ from transformers import AutoModelForCausalLM, AutoTokenizer
53
+
54
+ model_name = "taddeusb90/finbro-v0.1.0-llama-3-8B-instruct-131k"
55
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
56
+
57
+ model = AutoModelForCausalLM.from_pretrained(model_name)
58
+ text = "Your financial query here"
59
+
60
+ inputs = tokenizer(text, return_tensors="pt")
61
+
62
+ outputs = model.generate(inputs['input_ids'])
63
+
64
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
65
+ ```
66
+
67
+ Training Data
68
+ -------------
69
+
70
+ The Fibro Llama 3 8B model was trained on the Finbro Dataset, an extensive compilation of over 300,000 entries sourced from Investopedia and Sujet Finance. This dataset includes structured Q&A pairs, financial reports, and a variety of financial tasks pooled from multiple datasets.
71
+
72
+ The dataset can be found [here](https://huggingface.co/datasets/taddeusb90/finbro-v0.1.0)
73
+
74
+ This dataset will be extended to contain real and synthetic data on a wide range of financial tasks such as:
75
+ - Investment valuation
76
+ - Value investing
77
+ - Security analysis
78
+ - Derivatives
79
+ - Asset and portfolio management
80
+ - Financial information extraction
81
+ - Quantitative finance
82
+ - Econometrics
83
+ - Applied computer science in finance
84
+ and much more
85
+
86
+ Notice
87
+ --------
88
+
89
+ Please exercise caution and use it at your own risk. I assume no responsibility for any losses incurred if used.
90
+
91
+
92
+ Licensing
93
+ ---------
94
+
95
+ This model is released under the [META LLAMA 3 COMMUNITY LICENSE AGREEMENT](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/blob/main/LICENSE).
96
+
97
+
98
+ Citation
99
+ --------
100
+
101
+ If you use this model in your research, please cite it as follows:
102
+
103
+
104
+ ```bibtex
105
+ @misc{
106
+ finbro_v0.1.0-llama-3-8B-131k,
107
+ author = {Taddeus Buica},
108
+ title = {Fibro Llama 3 8B Model for Financial Analysis},
109
+ year = {2024},
110
+ journal = {Hugging Face repository},
111
+ howpublished = {\url{https://huggingface.co/taddeusb90/finbro-v0.1.0-llama-3-8B-instruct-131k}}
112
+ }
113
+ ```
114
+
115
+ Special thanks to the folks from AI@Meta for powering this project with their awesome models.
116
+
117
+ Contact
118
+ --------
119
+
120
+ If you would like to connect, share ideas, feedback, help support bigger models or even develop your own custom finance model on your private dataset let's talk on [LinkedIn](https://www.linkedin.com/in/taddeus-buica-1009a965/)
121
+
122
+ References
123
+ --------
124
+
125
+ [[1](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)] Llama 3 Model Card by AI@Meta, Year: 2024
126
+
127
+ [[2](https://huggingface.co/datasets/sujet-ai/Sujet-Finance-Instruct-177k)] Sujet Finance Dataset
128
+
129
+ [[3](https://huggingface.co/datasets/FinLang/investopedia-instruction-tuning-dataset)] Dataset Card for investopedia-instruction-tuning