kevin510 committed
Commit: c9d1e15
Parent: 69b26de

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -6,6 +6,8 @@ datasets:
 
 ## Flan-UL2-Dolly - Building a commercially viable LLM
 
+Model weights are outputs from epoch 1.
+
 This [Github repository](https://github.com/ConiferLabsWA/flan-ul2-dolly) contains code for leveraging the [Dolly 15K](https://github.com/databrickslabs/dolly/tree/master/data) dataset [released by Databricks](https://github.com/databrickslabs/dolly/tree/master/data) to fine tune the [Flan-UL2](https://huggingface.co/google/flan-ul2) model, leveraging recent advances in instruction tuning. Flan-UL2 has been shown to outperform Flan-T5 XXL on a number of metrics and has a 4x improvement in receptive field (2048 vs 512 tokens). Additionally, both the Flan-UL2 model and the Dolly 15K dataset have the significant advantage of a commercially viable license.
 
 ### Resource Considerations
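
For context, here is a minimal sketch of querying the model family this README describes, using the Hugging Face `transformers` library. It loads the base `google/flan-ul2` checkpoint named in the diff; the commit does not give a Hub repo id for the epoch-1 fine-tuned weights, but pointing `from_pretrained()` at that checkpoint instead would work the same way.

```python
# Minimal sketch, assuming `transformers` and `torch` are installed and enough
# memory is available for the ~20B-parameter model. The base checkpoint is
# used here; substituting the fine-tuned Flan-UL2-Dolly weights is the same call.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-ul2")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-ul2")

# Flan-UL2 is an encoder-decoder model, so generation is prompt -> completion.
prompt = "Explain instruction tuning in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```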