FemkeBakker
/

AmsterdamDocClassificationLlama200T2Epochs

Text Generation

Generated from Trainer

text-generation-inference

Inference Endpoints

Model card Files Files and versions Metrics Training metrics Community

FemkeBakker commited on Jul 12, 2024

Commit

d7c4fa2

·

verified ·

1 Parent(s): 67e3351

Update README.md

Files changed (1) hide show

README.md +13 -9

README.md CHANGED Viewed

@@ -15,24 +15,24 @@ should probably proofread and complete it, then remove this comment. -->
 # AmsterdamDocClassificationLlama200T2Epochs
-This model is a fine-tuned version of [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) on the [AmsterdamDocClassification](https://huggingface.co/datasets/FemkeBakker/AmsterdamBalancedFirst200Tokens) dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.8173
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:
@@ -62,6 +62,7 @@ The following hyperparameters were used during training:
 | 0.7233        | 1.7903 | 1107 | 0.8178          |
 | 0.8389        | 1.9891 | 1230 | 0.8173          |
 ### Framework versions
@@ -69,3 +70,6 @@ The following hyperparameters were used during training:
 - Pytorch 2.3.0+cu121
 - Datasets 2.19.1
 - Tokenizers 0.19.1

 # AmsterdamDocClassificationLlama200T2Epochs
+As part of the Assessing Large Language Models for Document Classification project by the Municipality of Amsterdam, we fine-tune Mistral, Llama, and GEITje for document classification.
+The fine-tuning is performed using the [AmsterdamBalancedFirst200Tokens](https://huggingface.co/datasets/FemkeBakker/AmsterdamBalancedFirst200Tokens) dataset, which consists of documents truncated to the first 200 tokens.
+In our research, we evaluate the fine-tuning of these LLMs across one, two, and three epochs.
+This model is a fine-tuned version of [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) and has been fine-tuned for two epochs.
 It achieves the following results on the evaluation set:
 - Loss: 0.8173
 ## Training and evaluation data
+- The training data consists of 9900 documents and their labels formatted into conversations.
+- The evaluation data consists of 1100 documents and their labels formatted into conversations.
 ## Training procedure
+See the [GitHub](https://github.com/Amsterdam-Internships/document-classification-using-large-language-models) for specifics about the training and the code.
 ### Training hyperparameters
 The following hyperparameters were used during training:
 | 0.7233        | 1.7903 | 1107 | 0.8178          |
 | 0.8389        | 1.9891 | 1230 | 0.8173          |
+Training time: it took 80 minutes to fine-tune the model for two epochs.
 ### Framework versions
 - Pytorch 2.3.0+cu121
 - Datasets 2.19.1
 - Tokenizers 0.19.1
+### Acknowledgements
+This model was trained as part of [insert thesis info] in collaboration with Amsterdam Intelligence for the City of Amsterdam.