Commit 27eb9d7
Author: DeathReaper0965
Parent(s): 786454c

Update README.md

README.md CHANGED
@@ -43,10 +43,11 @@ inference:
 ---
 
 # Flan-T5 (base-sized) Dialogue Summarization with reduced toxicity using RLAIF
-This model is a fine-tuned [Flan-T5 model](https://huggingface.co/google/flan-t5-base) on the [SAMSUM](https://huggingface.co/datasets/samsum) dataset.
-
+This model is a fine-tuned [Flan-T5 model](https://huggingface.co/google/flan-t5-base) on the [SAMSUM](https://huggingface.co/datasets/samsum) dataset. <br>
+It is further fine-tuned using Reinforcement Learning from AI Feedback (RLAIF). <br>
+Anthropic's Constitutional AI [paper](https://arxiv.org/abs/2212.08073) from 2022 provides some amazing insights into how RLAIF can be leveraged. Do check it out if interested!<br>
 
-
+More specifically, I've fine-tuned this model on a single downstream task of Dialogue Summarization on the above-mentioned dataset, with a primary objective of reduced toxicity in generated summaries.
 
 ## Model description
 This Model has the same architecture and Parameters as its base model. Please refer to this [link](https://arxiv.org/abs/2210.11416) to know more about the model details.