Lists of URLs from various training datasets
Nick Hagar
nhagar
AI & ML interests
digital media, collective attention, computational social science
Recent Activity
updated a dataset about 3 hours ago: nhagar/culturax_urls
published a dataset 1 day ago: nhagar/culturax_urls
updated a dataset 2 days ago: nhagar/falcon-refinedweb_urls
models: None public yet
datasets: 196
nhagar/culturax_urls • 218M • 44
nhagar/falcon-refinedweb_urls • 968M • 16
nhagar/CC-MAIN-2015-06_nyt_urls • 756k • 5
nhagar/CC-MAIN-2018-05_nyt_urls • 517k • 5
nhagar/CC-MAIN-2015-48_nyt_urls • 778k • 5
nhagar/CC-MAIN-2015-14_nyt_urls • 531k • 5
nhagar/CC-MAIN-2015-32_nyt_urls • 713k • 5
nhagar/CC-MAIN-2019-51_nyt_urls • 236k • 5
nhagar/CC-MAIN-2019-30_nyt_urls • 258k • 5
nhagar/CC-MAIN-2016-36_nyt_urls • 1.73M • 5