Commit 5c4ece7 by BramVanroy
Parent: 8628592

Update README.md

Files changed (1): README.md (+16, -13)
README.md CHANGED
@@ -1,36 +1,39 @@
 ---
-tags:
-- generated_from_trainer
+license: cc-by-nc-4.0
 datasets:
 - BramVanroy/alpaca-dolly-dutch
+language:
+- nl
+inference: false
 model-index:
-- name: 1e-3lr+512tbs@4n4gpus80
+- name: falcon-7b-ft-alpaca-cleaned-dutch
   results: []
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 
-# 1e-3lr+512tbs@4n4gpus80
-
-This model is a fine-tuned version of [ybelkada/falcon-7b-sharded-bf16](https://huggingface.co/ybelkada/falcon-7b-sharded-bf16) on the BramVanroy/alpaca-dolly-dutch dataset.
-It achieves the following results on the evaluation set:
-- Loss: 1.2612
+# falcon-7b-ft-alpaca-dolly-dutch
 
 ## Model description
 
-More information needed
+This model is a fine-tuned version of [ybelkada/falcon-7b-sharded-bf16](https://huggingface.co/ybelkada/falcon-7b-sharded-bf16) on the [BramVanroy/alpaca-dolly-dutch](https://huggingface.co/datasets/BramVanroy/alpaca-dolly-dutch) dataset.
+See the original [Falcon 7B model](https://huggingface.co/tiiuae/falcon-7b/) for more information, intended use, and biases.
+
 
 ## Intended uses & limitations
 
-More information needed
+This model is intended as a (poor) baseline for Dutch generative LLMs. It by no means aims to provide SOTA performance and is specifically intended for research purposes, and an opportunity for me to test hyperparameters and stability.
+
+Importantly, the original Falcon 7B model was only trained on English and French. Therefore, Dutch generations should be taken with a massive grain of salt.
 
 ## Training and evaluation data
 
-More information needed
+Trained on the synthetic [BramVanroy/alpaca-dolly-dutch](https://huggingface.co/datasets/BramVanroy/alpaca-dolly-dutch) instruction dataset.
+Therefore, commercial use of this model is forbidden. The model is intended for research purposes only.
 
 ## Training procedure
 
+Trained with LoRA and merged before upload. The adapters are in the `adapters` branch.
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
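
The updated card describes a merged, ready-to-use checkpoint. As a minimal sketch of loading it for research use with `transformers`: the repository id below is an assumption inferred from the model name in the card (which uses both `falcon-7b-ft-alpaca-cleaned-dutch` and `falcon-7b-ft-alpaca-dolly-dutch`), and `trust_remote_code=True` reflects that Falcon checkpoints of this era shipped custom modeling code.

```python
# Minimal sketch: load the merged model and generate a Dutch completion.
# NOTE: the repo id is an assumption; adjust to the actual repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "BramVanroy/falcon-7b-ft-alpaca-dolly-dutch"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,  # Falcon used custom modeling code at the time
)

# Example Dutch instruction ("Write a short poem about the sea.")
prompt = "Schrijf een kort gedicht over de zee."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```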
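The "Training procedure" note says the model was trained with LoRA and merged before upload, with the raw adapters kept in the `adapters` branch. A sketch of applying those adapters to the base checkpoint with `peft` instead of using the merged weights; the repo id and the exact adapter layout on that branch are assumptions:

```python
# Minimal sketch: attach the LoRA adapters from the `adapters` branch to the
# base model. Repo id is an assumption; the branch name comes from the card.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "ybelkada/falcon-7b-sharded-bf16",  # base checkpoint named in the card
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(
    base,
    "BramVanroy/falcon-7b-ft-alpaca-dolly-dutch",  # assumed repo id
    revision="adapters",  # adapters live in this branch per the card
)
model = model.merge_and_unload()  # fold the LoRA deltas into the base weights
```

`merge_and_unload()` folds the low-rank adapter deltas into the base weights, which is presumably how the merged checkpoint on the main branch was produced.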