dptrsa/STAR-QA

@@ -25,9 +25,9 @@ The model was evaluated on a held-out sample from the STAR-QA dataset (see below
 ## Training Data
-The model was fine-tuned from a corpus of audit, risk-management, compliance and associated regulatory documents sourced from the public internet. Documents were cleaned and chunked into 2-sentence blocks. Each block was then sent to a state-of-the-art LLM with the following prompt: "Write a question about {document_topic} for which this is the answer: {block}"
-The resulting question and its associated ground-truth answer (collectively a "pair") constitute a single training example for the fine-tuning step.
+The model was fine-tuned on a corpus of audit, risk-management, compliance and associated regulatory documents sourced from the public internet. Documents were cleaned and chunked into 2-sentence blocks. Each block was then sent to a state-of-the-art LLM with the following prompt: "Write a question about {document_topic} for which this is the answer: {block}"
+The resulting question and its associated ground-truth answer (collectively a "pair") constitute a single training example for the fine-tuning step. The final model was fine-tuned on ~18K such pairs.
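
The pair-generation procedure described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the regex sentence splitter, the function names `chunk_into_blocks` and `build_pairs`, and the `ask_llm` callable (a stand-in for whichever LLM client was used) are all assumptions. Only the prompt text and the 2-sentence block size come from the card.

```python
import re
from typing import Callable, Dict, List

# Prompt text as quoted in the model card.
PROMPT_TEMPLATE = (
    "Write a question about {document_topic} "
    "for which this is the answer: {block}"
)


def chunk_into_blocks(text: str, sentences_per_block: int = 2) -> List[str]:
    """Split cleaned text into blocks of consecutive sentences.

    Assumption: a naive split on '.', '!' or '?' followed by whitespace.
    The card does not say which sentence splitter was actually used.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    sentences = [s.strip() for s in parts if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_block])
        for i in range(0, len(sentences), sentences_per_block)
    ]


def build_pairs(
    text: str,
    document_topic: str,
    ask_llm: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Build (question, answer) pairs, one per block.

    `ask_llm` is a hypothetical callable that takes a prompt string and
    returns the generated question. The block itself is kept as the
    ground-truth answer.
    """
    pairs = []
    for block in chunk_into_blocks(text):
        prompt = PROMPT_TEMPLATE.format(
            document_topic=document_topic, block=block
        )
        pairs.append({"question": ask_llm(prompt), "answer": block})
    return pairs
```

Passing the LLM call in as a plain callable keeps the sketch independent of any particular client library; a stub such as `lambda prompt: "..."` is enough to exercise the chunking and pairing logic.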
 ## Training
 The model was fine-tuned with the parameters: