alimosavian committed
Commit 1097ce5
Parent: 1ca8c20

Update README.md

Files changed (1): README.md +8 -5
README.md CHANGED
@@ -67,7 +67,7 @@ r = pipe(
     messages,
     max_length=4096,
     do_sample=False,
-    eos_token_id=tokenizer.vocab['<end_of_turn>']
+    eos_token_id=[tokenizer.vocab['<end_of_turn>'], tokenizer.eos_token_id],
 )
 ```
 
@@ -77,17 +77,20 @@ r = pipe(
 
 ### Training Data
 
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+The model has been trained on a proprietary dataset of ~1.35M examples, consisting of:
+* High-quality Swedish instruct data
+  * Single-turn
+  * Multi-turn
+* High-quality Swedish <-> English translations
 
-[More Information Needed]
 
 ### Training Procedure
+For training we used Hugging Face Accelerate and TRL.
 
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
 
 #### Preprocessing [optional]
 
-[More Information Needed]
+For efficiency, we packed the examples into 8K context windows, reducing the number of examples to ~12% of the original count.
 
 
 #### Training Hyperparameters
 
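The edit in the first hunk broadens the stopping condition: generation now halts at either the model's `<end_of_turn>` marker or the tokenizer's default EOS token, rather than `<end_of_turn>` alone. A minimal sketch of the surrounding call, assuming a recent `transformers` release that accepts chat-format input; the model id and prompt are placeholders, not values from this repository:

```python
# Sketch of the full call around the diff above; "org/model" is a placeholder,
# substitute the actual repository id.
from transformers import pipeline

pipe = pipeline("text-generation", model="org/model")
tokenizer = pipe.tokenizer

messages = [{"role": "user", "content": "What is the capital of Sweden?"}]

r = pipe(
    messages,
    max_length=4096,
    do_sample=False,
    # Stop on either the chat-turn delimiter or the model's default EOS.
    eos_token_id=[tokenizer.vocab['<end_of_turn>'], tokenizer.eos_token_id],
)
print(r[0]["generated_text"])
```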
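The diff does not say how the ~1.35M examples are stored. Purely as an illustration, single- and multi-turn instruct data is often kept in the chat-style `messages` schema that TRL consumes directly; the records below are invented, not samples from the proprietary dataset:

```python
# Hypothetical records in chat "messages" form; the proprietary dataset's
# actual schema and contents are not published.
single_turn = {
    "messages": [
        {"role": "user", "content": "Summarize the following paragraph: ..."},
        {"role": "assistant", "content": "In short, ..."},
    ]
}

multi_turn = {
    "messages": [
        {"role": "user", "content": "Translate 'good morning' to Swedish."},
        {"role": "assistant", "content": "'God morgon'."},
        {"role": "user", "content": "And 'good night'?"},
        {"role": "assistant", "content": "'God natt'."},
    ]
}
```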
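The procedure section names Hugging Face Accelerate and TRL without an exact training script. A minimal sketch of what an SFT run could look like, assuming a TRL version whose `SFTConfig` exposes `packing` and `max_seq_length`; the base model id, data file, and hyperparameters are placeholders:

```python
# Hypothetical TRL SFT setup; model id, data file, and hyperparameters are
# placeholders, not the authors' configuration.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("json", data_files="train.jsonl", split="train")

trainer = SFTTrainer(
    model="org/base-model",       # placeholder base model
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="sft-out",
        max_seq_length=8192,      # matches the 8K windows described below
        packing=True,             # TRL's built-in example packing
    ),
)
trainer.train()
```

A script like this would typically be launched across GPUs with `accelerate launch train.py`, which is where Accelerate enters the picture.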
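The packing strategy itself is not specified. A simple greedy variant that concatenates tokenized examples into fixed 8K windows might look roughly like this; `pack` is a hypothetical helper, not code from the repository:

```python
# Hypothetical greedy packing: concatenate token-id sequences into windows
# of at most 8192 tokens. The actual preprocessing code is not published.
MAX_LEN = 8192  # 8K context window

def pack(examples: list[list[int]]) -> list[list[int]]:
    """Greedily pack token-id sequences into windows of at most MAX_LEN."""
    windows: list[list[int]] = []
    current: list[int] = []
    for ids in examples:
        # Flush the current window if this example no longer fits.
        if current and len(current) + len(ids) > MAX_LEN:
            windows.append(current)
            current = []
        current.extend(ids[:MAX_LEN])  # clip pathologically long examples
    if current:
        windows.append(current)
    return windows
```

A reduction to ~12% of the original count implies that each 8K window holds roughly eight examples on average (1 / 0.12 ≈ 8.3).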