Commit e176936 (1 parent: 856b940), committed by BramVanroy

Update README.md

  1. README.md +6 -2
README.md CHANGED
@@ -49,11 +49,15 @@ More information needed
 
 ## Intended uses & limitations
 
-More information needed
+The same limitations as [phi-2](https://huggingface.co/microsoft/phi-2#limitations-of-phi-2), and LLMs in general, apply here. LLMs hallucinate, make mistakes, and should not be trusted. Use at your own risk!
 
 ## Training and evaluation data
 
-More information needed
+Fietje 2B instruct was finetuned from [the base model](https://huggingface.co/BramVanroy/fietje-2b) on the following datasets. Number of training samples per dataset given in brackets, totalling 201,579.
+
+- [BramVanroy/ultrachat_200k_dutch](https://huggingface.co/datasets/BramVanroy/ultrachat_200k_dutch): gpt-4-1106-preview; multi-turn; fully generated (192,598)
+- [BramVanroy/no_robots_dutch](https://huggingface.co/datasets/BramVanroy/no_robots_dutch): gpt-4-1106-preview; prompt translate, answer generated; some items have system messages (8181)
+- [BramVanroy/belebele_dutch](https://huggingface.co/datasets/BramVanroy/belebele_dutch): Dutch portion of [belebele](https://huggingface.co/datasets/facebook/belebele), formatted into SFT format (800)
 
 ## Training procedure
 
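The per-dataset sample counts added in this change can be checked against the stated total with a few lines of Python. This is only a minimal sketch: the counts are copied from the README lines above, while the names `sample_counts`, `total_samples`, and `dataset_shares` are hypothetical helpers introduced for illustration and are not part of the repository.

```python
# Training sample counts per dataset, copied from the README change.
sample_counts = {
    "BramVanroy/ultrachat_200k_dutch": 192_598,
    "BramVanroy/no_robots_dutch": 8_181,
    "BramVanroy/belebele_dutch": 800,
}


def total_samples(counts):
    """Return the total number of training samples across all datasets."""
    return sum(counts.values())


def dataset_shares(counts):
    """Return each dataset's fraction of the total training samples."""
    total = total_samples(counts)
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    # The README states a total of 201,579 samples.
    print(total_samples(sample_counts))
    for name, share in dataset_shares(sample_counts).items():
        print(f"{name}: {share:.1%}")
```

The sum matches the 201,579 samples stated in the README, and the shares show that the mix is dominated by the ultrachat_200k_dutch data.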