Menouar committed
Commit 2ea151c
1 Parent(s): 51589ab

Update README.md

Files changed (1):
  1. README.md +23 -6
README.md CHANGED
@@ -9,6 +9,10 @@ base_model: tiiuae/falcon-7b
 model-index:
 - name: falcon7b-linear-equations
   results: []
+datasets:
+- Menouar/LinearEquations
+language:
+- en
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -16,22 +20,35 @@ should probably proofread and complete it, then remove this comment. -->
 
 # falcon7b-linear-equations
 
-This model is a fine-tuned version of [tiiuae/falcon-7b](https://huggingface.co/tiiuae/falcon-7b) on an unknown dataset.
+This model is a fine-tuned version of [tiiuae/falcon-7b](https://huggingface.co/tiiuae/falcon-7b) on a simple dataset of [linear equations](https://huggingface.co/datasets/Menouar/LinearEquations).
 
 ## Model description
 
-More information needed
+The objective of this model is to test Falcon7B's ability to solve mathematical linear equations after fine-tuning. The linear equations are of the form:
+
+```
+Ay + ay + b + B = Dy + dy + c + C
+```
+This model was trained using TRL, LoRA quantization, and Flash Attention.
+
+Due to limited GPU resources, I only considered 20,000 samples for training.
+
+For more information, check my [Notebook](https://colab.research.google.com/drive/1e8t5Cj6ZDAOc-z3bweWuBxF8mQZ9IPsH?usp=sharing).
 
-## Intended uses & limitations
-
-More information needed
+
+## Intended uses & limitations
+The model can solve any equation of the form `Ay + ay + b + B = Dy + dy + c + C` with integer coefficients ranging from -10 to 10. It cannot solve linear equations that contain constants other than A, a, b, B, D, d, c, C, nor equations whose constants are larger than 10 or smaller than -10. These limitations are due to the nature of the samples in the dataset and to the limited ability of Large Language Models (LLMs) to perform simple computations between numbers. The goal of this work is to demonstrate that fine-tuning an LLM on a specific dataset can yield excellent results on a specific task, as is the case with our new model compared to the original one.
 
 ## Training and evaluation data
 
-More information needed
+I will complete the evaluation data later; for now, here is an example of a linear equation where this model finds the correct solution, unlike other models such as ChatGPT 3.5, Bard, Llama 70B, and Mixtral:
 
 ## Training procedure
 
+For more information, check my [Notebook](https://colab.research.google.com/drive/1e8t5Cj6ZDAOc-z3bweWuBxF8mQZ9IPsH?usp=sharing).
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
@@ -48,7 +65,7 @@ The following hyperparameters were used during training:
 
 ### Training results
 
-
+The training results can be found [here](https://huggingface.co/Menouar/falcon7b-linear-equations/tensorboard).
 
 ### Framework versions
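
Every equation of the card's form `Ay + ay + b + B = Dy + dy + c + C` collapses to `(A + a - D - d) * y = c + C - b - B`, so the exact solution is `y = (c + C - b - B) / (A + a - D - d)` whenever `A + a != D + d`. As a minimal sketch (not the author's data-generation code), this draws integer coefficients in the stated -10..10 range and solves exactly:

```python
import random
from fractions import Fraction

def make_coefficients(rng):
    """Draw integer coefficients in [-10, 10] for Ay + ay + b + B = Dy + dy + c + C,
    rejecting degenerate draws where the y terms cancel (A + a == D + d)."""
    while True:
        coeffs = [rng.randint(-10, 10) for _ in range(8)]
        A, a, b, B, D, d, c, C = coeffs
        if A + a != D + d:
            return coeffs

def solve(A, a, b, B, D, d, c, C):
    """Collect y on the left and constants on the right:
    (A + a - D - d) * y = c + C - b - B."""
    return Fraction(c + C - b - B, A + a - D - d)

rng = random.Random(0)
A, a, b, B, D, d, c, C = make_coefficients(rng)
y = solve(A, a, b, B, D, d, c, C)
# Sanity check by substitution: both sides must agree at the solved y.
assert (A + a) * y + b + B == (D + d) * y + c + C
```

Using `Fraction` keeps the check exact even when the solution is not an integer, which a float comparison could miss.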
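
For trying the model out, here is a hedged usage sketch built on the `transformers` text-generation pipeline. The prompt template below is an assumption, not the trained format (the exact instruction layout is in the author's Notebook), and if the repo stores only a LoRA adapter rather than merged weights, the model would instead need to be loaded with `peft` on top of the base `tiiuae/falcon-7b`:

```python
# NOTE: this prompt template is a guess, not the format the model was
# fine-tuned on; consult the training Notebook for the exact layout.
PROMPT_TEMPLATE = "Solve for y: {equation}"

def build_prompt(equation: str) -> str:
    """Format one equation string into a (hypothetical) instruction prompt."""
    return PROMPT_TEMPLATE.format(equation=equation)

def generate_solution(equation: str) -> str:
    """Run the fine-tuned model (needs transformers installed and enough
    memory for Falcon-7B); imported lazily so the helper above is stdlib-only."""
    from transformers import pipeline
    generator = pipeline("text-generation", model="Menouar/falcon7b-linear-equations")
    out = generator(build_prompt(equation), max_new_tokens=64)
    return out[0]["generated_text"]

print(build_prompt("2y + 3y + 1 + 1 = 1y + 1y + 2 + 2"))
```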