Training, eval suite, and model from the paper "Large Scale Transfer Learning for Tabular Data via Language Modeling" https://arxiv.org/abs/2406.12031
ML Foundations
non-profit
Collections • 4
Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens"
- MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens
  Paper • 2406.11271 • Published • 18
- mlfoundations/MINT-1T-HTML
  Viewer • Updated • 659M • 201k • 69
- mlfoundations/MINT-1T-ArXiv
  Viewer • Updated • 5.6M • 121 • 45
- mlfoundations/MINT-1T-PDF-CC-2024-18
  Updated • 45 • 16
Models • 7
mlfoundations/dclm-7b-it
Updated • 76 • 8
mlfoundations/fasttext-oh-eli5
Updated • 16
mlfoundations/tabula-8b
Text Generation • Updated • 250 • 21
mlfoundations/dclm-it-quantized
Updated
mlfoundations/scaling
Updated • 2
mlfoundations/open_lm_7B_1.25T
Updated • 4
mlfoundations/open_lm_1B
Updated • 11
Datasets • 35
mlfoundations/dclm-pool-7b-2x
Preview • Updated • 3
mlfoundations/dclm-pool-1b-1x
Preview • Updated • 71 • 1
mlfoundations/MINT-1T-HTML
Viewer • Updated • 659M • 201k • 69
mlfoundations/MINT-1T-PDF-CC-2023-40
Updated • 17 • 1
mlfoundations/MINT-1T-PDF-CC-2023-23
Viewer • Updated • 3.26M • 20 • 1
mlfoundations/MINT-1T-PDF-CC-2023-50
Viewer • Updated • 3.21M • 27 • 3
mlfoundations/MINT-1T-PDF-CC-2023-14
Viewer • Updated • 2.38M • 39 • 1
mlfoundations/MINT-1T-PDF-CC-2024-10
Updated • 24 • 1
mlfoundations/MINT-1T-PDF-CC-2023-06
Updated • 29 • 2
mlfoundations/MINT-1T-PDF-CC-2024-18
Updated • 45 • 16