Ivan Vykopal

ivykopal

ivanvykopal

AI & ML interests

NLP, Computer Vision

ivykopal's activity

upvoted a collection about 12 hours ago

Llama 4

Collection

Llama 4 release • 10 items • Updated about 13 hours ago • 274

upvoted 3 papers 6 days ago

BiblioPage: A Dataset of Scanned Title Pages for Bibliographic Metadata Extraction

Paper • 2503.19658 • Published 12 days ago • 2

AnnoPage Dataset: Dataset of Non-Textual Elements in Documents with Fine-Grained Categorization

Paper • 2503.22526 • Published 9 days ago • 2

TextBite: A Historical Czech Document Dataset for Logical Page Segmentation

Paper • 2503.16664 • Published 17 days ago • 2

liked a model 10 days ago

AtlaAI/Selene-1-Mini-Llama-3.1-8B

Text Generation • Updated Feb 17 • 9.19k • 80

liked a model about 2 months ago

perplexity-ai/r1-1776

Text Generation • Updated Feb 26 • 35.6k • 2.21k

upvoted a paper about 2 months ago

Small Models, Big Impact: Efficient Corpus and Graph-Based Adaptation of Small Multilingual Language Models for Low-Resource Languages

Paper • 2502.10140 • Published Feb 14 • 9

updated a dataset 3 months ago

ivykopal/fineweb2-slovak

Viewer • Updated Jan 18 • 26.5M • 168 • 1

New activity in ivykopal/fineweb2-slovak 3 months ago

Librarian Bot: Add language metadata for dataset

#2 opened 3 months ago by librarian-bot

published a dataset 3 months ago

ivykopal/fineweb2-slovak

Viewer • Updated Jan 18 • 26.5M • 168 • 1

updated a model 3 months ago

ivykopal/slovak-tokenizer

Updated Jan 12 • 5

liked a model 3 months ago

microsoft/phi-4

Text Generation • Updated Feb 24 • 577k • 1.97k

liked a Space 3 months ago

2024 AI Timeline 📈

View and filter AI model releases in 2024 • 451

liked 2 models 3 months ago

Snowflake/snowflake-arctic-embed-l-v2.0

deepseek-ai/DeepSeek-V3-Base

Updated 10 days ago • 27.1k • 1.62k

reacted to davanstrien's post with ❤️ 3 months ago

Post

3264

🇸🇰 Hovorte po slovensky? Help build better AI for Slovak!

We only need 90 more annotations to include Slovak in the next Hugging Face FineWeb2-C dataset (data-is-better-together/fineweb-c) release!

Your contribution will help create better language models for 5+ million Slovak speakers.

Annotate here: data-is-better-together/fineweb-c.

Read more about why we're doing it: https://huggingface.co/blog/davanstrien/fineweb2-community

3 replies

liked a dataset 3 months ago

HuggingFaceFW/fineweb-2

Viewer • Updated Jan 8 • 12.5B • 46.8k • 454

liked a model 3 months ago

answerdotai/ModernBERT-base

Fill-Mask • Updated Jan 15 • 3.26M • 817

upvoted a paper 4 months ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 363