matiusX committed on
Commit
78f5a6a
1 Parent(s): 32291af

Update README.md

Files changed (1)
  1. README.md +14 -20
README.md CHANGED
@@ -21,7 +21,7 @@ which includes various resolutions and norms provided by UFAM.
21
 
22
  - **Developed by:** Matheus dos Santos Palheta
23
  - **Model type:** More Information Needed
24
- - **Language(s) (NLP):** Portuguese, english
25
  - **License:** MIT
26
  - **Finetuned from model:** unsloth/llama-3-8b-bnb-4bit
27
 
@@ -30,8 +30,7 @@ which includes various resolutions and norms provided by UFAM.
30
  <!-- Provide the basic links for the model. -->
31
 
32
  - **Repository:** [More Information Needed]
33
- - **Paper [optional]:** [More Information Needed]
34
- - **Demo [optional]:** [More Information Needed]
35
 
36
  ## Uses
37
 
@@ -40,7 +39,7 @@ This model is intended for use by anyone with questions about UFAM's legislation
40
 
41
  This model can be directly used to answer questions regarding UFAM's academic legislation without additional fine-tuning.
42
 
43
- ### Downstream Use [optional]
44
 
45
  The model can be integrated into larger ecosystems or applications, particularly those focusing on academic information systems,
46
  legal information retrieval, or automated student support systems from UFAM.
@@ -55,10 +54,6 @@ It should not be used for legal advice or any critical decision-making processes
55
  While the model has been fine-tuned for accuracy in the context of UFAM's legislation, it may still exhibit biases present in the training data.
56
  Additionally, the model's performance is constrained by the quality and comprehensiveness of the synthetic dataset generated.
57
 
58
- ### Recommendations
59
-
60
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
61
-
62
  ## How to Get Started with the Model
63
 
64
  Use the code below to get started with the model.
@@ -116,19 +111,12 @@ _ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)
116
 
117
  ### Training Data
118
 
119
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
120
-
121
- [More Information Needed]
122
 
123
  ### Training Procedure
124
 
125
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
126
-
127
- #### Preprocessing [optional]
128
-
129
- [More Information Needed]
130
-
131
-
132
  #### Training Hyperparameters
133
 
134
  - **Training regime:** Mixed precision (fp16)
@@ -139,9 +127,15 @@ _ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)
139
 
140
  #### Speeds, Sizes, Times [optional]
141
 
142
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
 
 
 
 
 
 
143
 
144
- [More Information Needed]
145
 
146
  ## Evaluation
147
 
 
21
 
22
  - **Developed by:** Matheus dos Santos Palheta
23
  - **Model type:** More Information Needed
24
+ - **Language(s) (NLP):** Portuguese, English
25
  - **License:** MIT
26
  - **Finetuned from model:** unsloth/llama-3-8b-bnb-4bit
27
 
 
30
  <!-- Provide the basic links for the model. -->
31
 
32
  - **Repository:** [More Information Needed]
33
+
 
34
 
35
  ## Uses
36
 
 
39
 
40
  This model can be directly used to answer questions regarding UFAM's academic legislation without additional fine-tuning.
41
 
42
+ ### Downstream Use
43
 
44
  The model can be integrated into larger ecosystems or applications, particularly those focusing on academic information systems,
45
  legal information retrieval, or automated student support systems from UFAM.
 
54
  While the model has been fine-tuned for accuracy in the context of UFAM's legislation, it may still exhibit biases present in the training data.
55
  Additionally, the model's performance is constrained by the quality and comprehensiveness of the synthetic dataset generated.
56
 
 
 
 
 
57
  ## How to Get Started with the Model
58
 
59
  Use the code below to get started with the model.
 
111
 
112
  ### Training Data
113
 
114
+ The training data for this model is based on the academic legislation of UFAM. It includes a wide range of documents,
115
+ such as resolutions and norms, which have been pre-processed and structured to create a synthetic dataset of questions and answers.
116
+ For more details on the dataset, including the pre-processing and filtering steps, please refer to the Dataset Card available [here](https://huggingface.co/datasets/matiusX/legislacao-ufam).
117
 
118
  ### Training Procedure
119
 
 
 
 
 
 
 
 
120
  #### Training Hyperparameters
121
 
122
  - **Training regime:** Mixed precision (fp16)
 
127
 
128
  #### Speeds, Sizes, Times [optional]
129
 
130
+ - **Global Step:** 60
131
+ - **Metrics:**
132
+ - **Train Runtime:** 1206.8508 seconds
133
+ - **Train Samples per Second:** 0.398
134
+ - **Train Steps per Second:** 0.05
135
+ - **Total FLOPs:** 4.451323701362688e+16
136
+ - **Train Loss:** 0.9744117197891077
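
The throughput figures in this list can be cross-checked against each other with a little arithmetic. The sketch below is only a consistency check on the numbers reported above; the variable names are illustrative, and the "samples per step" value it derives is an inference from those numbers, not something the model card states (it could reflect a per-device batch size combined with gradient accumulation).

```python
# Values copied from the metrics reported in the card.
global_step = 60
train_runtime_s = 1206.8508
samples_per_second = 0.398
steps_per_second = 0.05

# Steps per second should be close to global_step / runtime.
derived_steps_per_second = global_step / train_runtime_s

# Throughput times runtime approximates the number of samples processed;
# dividing by the step count gives the samples consumed per optimizer step.
# This is an inferred quantity, not a documented hyperparameter.
derived_total_samples = samples_per_second * train_runtime_s
derived_samples_per_step = derived_total_samples / global_step

print(f"derived steps/s:          {derived_steps_per_second:.4f}")
print(f"derived samples seen:     {derived_total_samples:.1f}")
print(f"derived samples per step: {derived_samples_per_step:.2f}")
```

The derived steps-per-second rounds to the reported 0.05, and the derived samples per step comes out very close to 8, so the reported runtime, step count, and throughput are mutually consistent.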
137
 
138
+ ![Alt Text](output.png)
139
 
140
  ## Evaluation
141