FemkeBakker committed
Commit d7c4fa2 · verified · 1 Parent(s): 67e3351

Update README.md

Files changed (1):
1. README.md: +13 -9
README.md CHANGED
@@ -15,24 +15,24 @@ should probably proofread and complete it, then remove this comment. -->
 
 # AmsterdamDocClassificationLlama200T2Epochs
 
-This model is a fine-tuned version of [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) on the [AmsterdamDocClassification](https://huggingface.co/datasets/FemkeBakker/AmsterdamBalancedFirst200Tokens) dataset.
+As part of the Assessing Large Language Models for Document Classification project by the Municipality of Amsterdam, we fine-tune Mistral, Llama, and GEITje for document classification.
+The fine-tuning is performed using the [AmsterdamBalancedFirst200Tokens](https://huggingface.co/datasets/FemkeBakker/AmsterdamBalancedFirst200Tokens) dataset, which consists of documents truncated to the first 200 tokens.
+In our research, we evaluate the fine-tuning of these LLMs across one, two, and three epochs.
+This model is a fine-tuned version of [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) and has been fine-tuned for two epochs.
+
 It achieves the following results on the evaluation set:
 - Loss: 0.8173
 
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
 
 ## Training and evaluation data
 
-More information needed
+- The training data consists of 9900 documents and their labels formatted into conversations.
+- The evaluation data consists of 1100 documents and their labels formatted into conversations.
 
 ## Training procedure
 
+See the [GitHub](https://github.com/Amsterdam-Internships/document-classification-using-large-language-models) for specifics about the training and the code.
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
@@ -62,6 +62,7 @@ The following hyperparameters were used during training:
 | 0.7233 | 1.7903 | 1107 | 0.8178 |
 | 0.8389 | 1.9891 | 1230 | 0.8173 |
 
+Training time: it took 80 minutes to fine-tune the model for two epochs.
 
 ### Framework versions
 
@@ -69,3 +70,6 @@ The following hyperparameters were used during training:
 - Pytorch 2.3.0+cu121
 - Datasets 2.19.1
 - Tokenizers 0.19.1
+
+### Acknowledgements
+This model was trained as part of [insert thesis info] in collaboration with Amsterdam Intelligence for the City of Amsterdam.
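The updated card describes the fine-tuned checkpoint and the conversation-formatted dataset but does not include a usage example. A minimal sketch with `transformers` and `datasets` might look like the following; it is not part of the commit. The model repo id, the split name, and the `messages` field are assumptions made for illustration, and only the dataset id comes from the link in the README.

```python
# Hypothetical usage sketch -- not part of the commit. The model repo id,
# the split name, and the "messages" field are assumptions for illustration.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "FemkeBakker/AmsterdamDocClassificationLlama200T2Epochs"  # assumed from the card title
dataset_id = "FemkeBakker/AmsterdamBalancedFirst200Tokens"           # linked in the README

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")  # requires accelerate

# The card says documents and labels are formatted into conversations;
# here we assume each example carries a chat-style "messages" list.
example = load_dataset(dataset_id, split="test")[0]  # split name is an assumption
prompt = tokenizer.apply_chat_template(
    example["messages"][:-1],  # drop the gold label, keep the document prompt
    tokenize=False,
    add_generation_prompt=True,
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Given the framework versions listed on the card (PyTorch 2.3.0, Datasets 2.19.1), a recent transformers release with `apply_chat_template` support is assumed.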