Update README.md
README.md
CHANGED
@@ -21,7 +21,7 @@ which includes various resolutions and norms provided by UFAM.

 - **Developed by:** Matheus dos Santos Palheta
 - **Model type:** More Information Needed
-- **Language(s) (NLP):** Portuguese,
+- **Language(s) (NLP):** Portuguese, English
 - **License:** MIT
 - **Finetuned from model:** unsloth/llama-3-8b-bnb-4bit

@@ -30,8 +30,7 @@ which includes various resolutions and norms provided by UFAM.
 <!-- Provide the basic links for the model. -->

 - **Repository:** [More Information Needed]
-
-- **Demo [optional]:** [More Information Needed]
+

 ## Uses

@@ -40,7 +39,7 @@ This model is intended for use by anyone with questions about UFAM's legislation

 This model can be directly used to answer questions regarding UFAM's academic legislation without additional fine-tuning.

-### Downstream Use
+### Downstream Use

 The model can be integrated into larger ecosystems or applications, particularly those focusing on academic information systems,
 legal information retrieval, or automated student support systems from UFAM.
@@ -55,10 +54,6 @@ It should not be used for legal advice or any critical decision-making processes
 While the model has been fine-tuned for accuracy in the context of UFAM's legislation, it may still exhibit biases present in the training data.
 Additionally, the model's performance is constrained by the quality and comprehensiveness of the synthetic dataset generated.

-### Recommendations
-
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-
 ## How to Get Started with the Model

 Use the code below to get started with the model.
@@ -116,19 +111,12 @@ _ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)

 ### Training Data

-
-
-
+The training data for this model is based on the academic legislation of UFAM. It includes a wide range of documents,
+such as resolutions and norms, which have been pre-processed and structured to create a synthetic dataset of questions and answers.
+For more details on the dataset, including the pre-processing and filtering steps, please refer to the Dataset Card available [here](https://huggingface.co/datasets/matiusX/legislacao-ufam).

 ### Training Procedure

-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
-#### Preprocessing [optional]
-
-[More Information Needed]
-
-
 #### Training Hyperparameters

 - **Training regime:** Mixed precision (fp16)
@@ -139,9 +127,15 @@ _ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)

 #### Speeds, Sizes, Times [optional]

-
+- **Global Step:** 60
+- **Metrics:**
+- **Train Runtime:** 1206.8508 seconds
+- **Train Samples per Second:** 0.398
+- **Train Steps per Second:** 0.05
+- **Total FLOPs:** 4.451323701362688e+16
+- **Train Loss:** 0.9744117197891077

-[
+![Alt Text](output.png)

 ## Evaluation

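The throughput figures added under "Speeds, Sizes, Times" can be cross-checked against each other with a little arithmetic. The following is a minimal sketch, not part of the commit: the variable names are illustrative, and the values are copied from the lines this change adds to the README.

```python
# Values copied from the "Speeds, Sizes, Times" lines added in this commit.
global_step = 60
train_runtime_s = 1206.8508
samples_per_second = 0.398
steps_per_second = 0.05

# Steps per second implied by the step count and the total runtime.
derived_steps_per_second = global_step / train_runtime_s

# Approximate number of samples processed, and samples per optimizer step.
approx_samples = samples_per_second * train_runtime_s
approx_samples_per_step = approx_samples / global_step

print(f"steps/s: {derived_steps_per_second:.3f}")
print(f"samples seen: {approx_samples:.0f}")
print(f"samples/step: {approx_samples_per_step:.1f}")
```

Under these numbers the derived step rate rounds to the reported 0.05, and the run works out to roughly 480 samples at about 8 samples per step. That would be consistent with an effective batch size of 8, although the diff itself does not state the batch size or gradient accumulation settings.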