Separate training data by country

#117
by wponhf - opened

Greetings, I sent a question to bigscience-contact@googlegroups.com but have not received a response. If I am asking this question in the wrong forum, I apologize. Are there any resources available to understand how to isolate or categorize the English-sourced training data according to its country of origin? Thanks.

BigScience Workshop org
edited Sep 26, 2022

You can find this information (when available) in the data card deck available here, under Speaker Locations:
Data Cards per Source
