matiusX/lamma-legis-ufam

PEFT

Safetensors

Portuguese

English


matiusX committed on Aug 6, 2024

Commit 78f5a6a • 1 Parent(s): 32291af

Update README.md


Files changed (1)

README.md +14 -20


@@ -21,7 +21,7 @@ which includes various resolutions and norms provided by UFAM.
 - **Developed by:** Matheus dos Santos Palheta
 - **Model type:** More Information Needed
-- **Language(s) (NLP):** Portuguese, english
 - **License:** MIT
 - **Finetuned from model:** unsloth/llama-3-8b-bnb-4bit
@@ -30,8 +30,7 @@ which includes various resolutions and norms provided by UFAM.
 <!-- Provide the basic links for the model. -->
 - **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
 ## Uses
@@ -40,7 +39,7 @@ This model is intended for use by anyone with questions about UFAM's legislation
 This model can be directly used to answer questions regarding UFAM's academic legislation without additional fine-tuning.
-### Downstream Use [optional]
 The model can be integrated into larger ecosystems or applications, particularly those focusing on academic information systems,
 legal information retrieval, or automated student support systems from UFAM.
@@ -55,10 +54,6 @@ It should not be used for legal advice or any critical decision-making processes
 While the model has been fine-tuned for accuracy in the context of UFAM's legislation, it may still exhibit biases present in the training data.
 Additionally, the model's performance is constrained by the quality and comprehensiveness of the synthetic dataset generated.
-### Recommendations
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
 ## How to Get Started with the Model
 Use the code below to get started with the model.
@@ -116,19 +111,12 @@ _ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)
 ### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
 ### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
 #### Training Hyperparameters
 - **Training regime:** Mixed precision (fp16)
@@ -139,9 +127,15 @@ _ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128)
 #### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
 ## Evaluation

 - **Developed by:** Matheus dos Santos Palheta
 - **Model type:** More Information Needed
+- **Language(s) (NLP):** Portuguese, English
 - **License:** MIT
 - **Finetuned from model:** unsloth/llama-3-8b-bnb-4bit
 <!-- Provide the basic links for the model. -->
 - **Repository:** [More Information Needed]
 ## Uses
 This model can be directly used to answer questions regarding UFAM's academic legislation without additional fine-tuning.
+### Downstream Use
 The model can be integrated into larger ecosystems or applications, particularly those focusing on academic information systems,
 legal information retrieval, or automated student support systems from UFAM.
 While the model has been fine-tuned for accuracy in the context of UFAM's legislation, it may still exhibit biases present in the training data.
 Additionally, the model's performance is constrained by the quality and comprehensiveness of the synthetic dataset generated.
 ## How to Get Started with the Model
 Use the code below to get started with the model.
 ### Training Data
+The training data for this model is based on the academic legislation of UFAM. It includes a wide range of documents,
+such as resolutions and norms, which have been pre-processed and structured to create a synthetic dataset of questions and answers.
+For more details on the dataset, including the pre-processing and filtering steps, please refer to the Dataset Card available [here](https://huggingface.co/datasets/matiusX/legislacao-ufam).
 ### Training Procedure
 #### Training Hyperparameters
 - **Training regime:** Mixed precision (fp16)
 #### Speeds, Sizes, Times [optional]
+- **Global Step:** 60
+- **Metrics:**
+  - **Train Runtime:** 1206.8508 seconds
+  - **Train Samples per Second:** 0.398
+  - **Train Steps per Second:** 0.05
+  - **Total FLOPs:** 4.451323701362688e+16
+  - **Train Loss:** 0.9744117197891077
+![Alt Text](output.png)
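As a rough consistency check on the metrics listed above, the throughput figures can be recomputed from the global step count and the runtime. The sketch below uses only the values reported in the README. The variable names are illustrative, and the samples-per-step figure is an inference from those numbers, not something the commit states.

```python
# Values copied from the "Speeds, Sizes, Times" metrics in the README.
global_step = 60
train_runtime_s = 1206.8508
reported_samples_per_s = 0.398
reported_steps_per_s = 0.05

# Steps per second should equal total steps divided by total runtime.
derived_steps_per_s = global_step / train_runtime_s

# Approximate number of samples processed over the whole run.
approx_total_samples = reported_samples_per_s * train_runtime_s

# Implied samples per optimizer step (an inference, not a reported value).
approx_samples_per_step = approx_total_samples / global_step

print(f"derived steps/s:       {derived_steps_per_s:.4f}")
print(f"approx. total samples: {approx_total_samples:.1f}")
print(f"approx. samples/step:  {approx_samples_per_step:.2f}")
```

The derived steps-per-second value rounds to the reported 0.05, and the implied samples per step comes out close to 8, which would be consistent with an effective batch size of 8 if that is how the run was configured.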
 ## Evaluation