add support for `kmfoda/booksum` #9

by pszemraj - opened

A good summarization dataset that was initially proposed by Salesforce Research and added as kmfoda/booksum (thanks @kmfoda).

I trained this LongT5 model and would like to evaluate it (and the other models too!)

Thanks for raising the issue with this dataset @pszemraj !

I've opened an issue that you can track here (it's usually quite fast to get resolved): https://github.com/huggingface/datasets/issues/4641

@pszemraj this is now fixed and you can evaluate models on the dataset 🔥!

lewtun changed discussion status to closed

Thanks so much! I'm getting some errors trying to actually run an evaluation (for reference: I want the test split, text='chapter', target='summary_text', and the model referenced above). Some details here:

HTTPError: 400 Client Error: Bad Request for url: https://api-staging.autotrain.huggingface.co/projects/create
Traceback:
File "/home/user/.local/lib/python3.8/site-packages/streamlit/scriptrunner/script_runner.py", line 554, in _run_script
    exec(code, module.__dict__)
File "app.py", line 486, in <module>
    project_json_resp = http_post(
File "/home/user/app/utils.py", line 44, in http_post
    response.raise_for_status()
File "/home/user/.local/lib/python3.8/site-packages/requests/models.py", line 1022, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
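
For reference, the column mapping from the settings above (text='chapter', target='summary_text') could be sketched roughly like this. This is only an illustration: `to_eval_records` and the generic "text"/"target" output keys are hypothetical names, the sample row is made up rather than taken from kmfoda/booksum, and nothing here calls the evaluation API.

```python
from typing import Dict, Iterable, List

# Column names come from the settings quoted above; the "text"/"target"
# output keys are an assumption about what an evaluator might expect.
TEXT_COLUMN = "chapter"
TARGET_COLUMN = "summary_text"


def to_eval_records(
    rows: Iterable[Dict[str, str]],
    text_column: str = TEXT_COLUMN,
    target_column: str = TARGET_COLUMN,
) -> List[Dict[str, str]]:
    """Rename dataset columns to generic text/target keys (hypothetical helper)."""
    records = []
    for index, row in enumerate(rows):
        # Fail early with a readable message if a column name is wrong.
        missing = [c for c in (text_column, target_column) if c not in row]
        if missing:
            raise KeyError(f"row {index} is missing columns: {missing}")
        records.append({"text": row[text_column], "target": row[target_column]})
    return records


# Toy row standing in for the real test split (not actual dataset content).
sample_rows = [
    {"chapter": "Full chapter text ...", "summary_text": "Short summary ..."},
]
print(to_eval_records(sample_rows))
```

A check like this catches a mistyped column name locally before a job is submitted.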

AFAIK, the error seems to occur when creating an AutoTrain job. Where would be a good place to open an issue?

Thanks for reporting the issue @pszemraj !

I'm taking a look at what causes this and will report back when I've found a fix

lewtun changed discussion status to open

Hey @pszemraj we've deployed a fix for this issue and here's an evaluation done with the current API: https://huggingface.co/pszemraj/long-t5-tglobal-base-16384-book-summary/discussions/3/files

Feel free to close this issue if the problem is solved for you too :)

Thanks so much! I am waiting for an eval to come back for kmfoda/booksum on that model and then will close :)

Got the results! Thanks again for handling

pszemraj changed discussion status to closed

Great!

Just FYI this model has a large max input length of 4,096 tokens, so it takes a while to evaluate it on large datasets like CNN DailyMail :) This means that if you submit multiple evaluations with the same configuration, you may get duplicate Hub PRs until one of them is completed.
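
One way to avoid resubmitting the same configuration while a job is still running is to track pending evaluations on the client side. The following is a minimal sketch under that assumption: `EvalConfig` and `PendingEvaluations` are hypothetical names, and this is not how the evaluator itself handles duplicates.

```python
from typing import NamedTuple, Set


class EvalConfig(NamedTuple):
    """Fields that identify one evaluation request (hypothetical structure)."""
    model: str
    dataset: str
    split: str
    text_column: str
    target_column: str


class PendingEvaluations:
    """Remembers submitted configurations until they are marked as done."""

    def __init__(self) -> None:
        self._pending: Set[EvalConfig] = set()

    def try_submit(self, config: EvalConfig) -> bool:
        # Return False when an identical configuration is still pending.
        if config in self._pending:
            return False
        self._pending.add(config)
        return True

    def mark_done(self, config: EvalConfig) -> None:
        # Once results arrive, the same configuration may be submitted again.
        self._pending.discard(config)


tracker = PendingEvaluations()
config = EvalConfig(
    model="pszemraj/long-t5-tglobal-base-16384-book-summary",
    dataset="kmfoda/booksum",
    split="test",
    text_column="chapter",
    target_column="summary_text",
)
print(tracker.try_submit(config))  # first submission is accepted
print(tracker.try_submit(config))  # identical pending submission is skipped
```

Calling `mark_done` after the results come back still allows a deliberate re-evaluation of an updated model later.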

Fair point - I realized that as the results finally started coming in, my bad! If you see duplicate jobs for the same model from me on your end, feel free to kill them :)

FYI though, every once in a while I will need to re-evaluate the same model on the same dataset, because some of the models are still WIP and being updated as I get more training in (sigh, these take forever to train).