## Flan-UL2-Dolly - Building a commercially viable LLM

The model weights in this repository are the outputs from epoch 1 of fine-tuning.

This [GitHub repository](https://github.com/ConiferLabsWA/flan-ul2-dolly) contains code that leverages the [Dolly 15K](https://github.com/databrickslabs/dolly/tree/master/data) dataset [released by Databricks](https://github.com/databrickslabs/dolly/tree/master/data) to fine-tune the [Flan-UL2](https://huggingface.co/google/flan-ul2) model using recent advances in instruction tuning. Flan-UL2 has been shown to outperform Flan-T5 XXL on a number of benchmarks and offers a 4x larger receptive field (2048 vs. 512 tokens). Additionally, both the Flan-UL2 model and the Dolly 15K dataset have the significant advantage of being available under commercially viable licenses.
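
As a quick reference, a minimal sketch of loading and prompting the fine-tuned weights with the Hugging Face `transformers` library is shown below. The model ID is a placeholder for wherever these weights are published, and `device_map="auto"` assumes the `accelerate` package is installed.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder model ID -- substitute the repository where these
# fine-tuned Flan-UL2-Dolly weights are actually published.
model_id = "ConiferLabsWA/flan-ul2-dolly"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(
    model_id,
    device_map="auto",   # requires the `accelerate` package
    torch_dtype="auto",
)

# Flan-UL2 is an encoder-decoder model, so a plain instruction works as the prompt.
prompt = "Explain the difference between nuclear fission and nuclear fusion."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```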

### Resource Considerations