ashishtanwer's Collections
Dataset
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only
Paper • 2306.01116 • Published • 33
Viewer • Updated • 48.6B • 445k • 1.83k
Viewer • Updated • 968M • 19.3k • 829
Preview • Updated • 50.2k • 449
LLaMA: Open and Efficient Foundation Language Models
Paper • 2302.13971 • Published • 14
mosaicml/mpt-7b
Text Generation • Updated • 31.2k • 1.16k
togethercomputer/RedPajama-Data-V2
Updated • 3.23k • 359
stepfun-ai/GOT-OCR2_0
Image-Text-to-Text • Updated • 630k • 1.35k
Focus Anywhere for Fine-grained Multi-page Document Understanding
Paper • 2405.14295 • Published • 1
Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models
Paper • 2312.06109 • Published • 21
💬 GOT Online
llava-hf/llava-1.5-7b-hf
Image-Text-to-Text • Updated • 755k • 224
microsoft/OmniParser
Image-Text-to-Text • Updated • 1.64k • 1.54k
ColPali: Efficient Document Retrieval with Vision Language Models
Paper • 2407.01449 • Published • 43
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Paper • 2407.03320 • Published • 93
Viewer • Updated • 1.45M • 17.5k • 176