Dataset filtering

#29
by mchochowski - opened

Hi, the Dataset section says: "We used a curated, filtered selection of most of the GPT-4 augmented data from our OpenOrca dataset, which aims to reproduce the Orca Research Paper dataset."
Could the authors elaborate on this? What were the actual reasons some samples were removed?
