Hi, I'm trying to repro your accuracy results. Where can we find the splits to evaluate on? Some of the datasets only have training splits.
I'm also interested in the conversation format.