John Locke

johnlockejrr

AI & ML interests

NLP, OCR, AI

Recent Activity

Organizations

None yet

johnlockejrr's activity

liked a Space 4 days ago
New activity in Gabriel/Qwen2-VL-2B-Instruct 7 days ago

Model inference

#1 opened 7 days ago by johnlockejrr
reacted to MohamedRashad's post with ❤️❤️ 13 days ago
New activity in MohamedRashad/arabic-small-nougat 18 days ago

Arabic Small Nougat

#1 opened 8 months ago by johnlockejrr
reacted to MohamedRashad's post with 🤗🚀 19 days ago
upvoted an article 24 days ago
reacted to jsulz's post with 🔥 27 days ago
When the XetHub crew joined Hugging Face this fall, @erinys and I started brainstorming how to share our work to replace Git LFS on the Hub. Uploading and downloading large models and datasets takes precious time. That’s where our chunk-based approach comes in.

Instead of versioning files (like Git and Git LFS), we version variable-sized chunks of data. For the Hugging Face community, this means:

⏩ Only upload the chunks that changed.
🚀 Download just the updates, not the whole file.
🧠 We store your file as deduplicated chunks.

In our benchmarks, we found that using content-defined chunking (CDC) to store iterative model and dataset versions led to transfer speedups of ~2x, but this isn’t just a performance boost. It’s a rethinking of how we manage models and datasets on the Hub.
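
To make the chunk idea concrete, here is a minimal Python sketch of content-defined chunking with a deduplicating store. It is an illustration under stated assumptions, not the Hub's actual implementation: the gear-style rolling hash, the `MIN_SIZE`/`MAX_SIZE`/`MASK` values, and the `chunk`, `store`, and `restore` helpers are all hypothetical names chosen for this example.

```python
import hashlib
import random

# Hypothetical parameters for illustration only; real systems use much larger chunks.
MIN_SIZE = 16          # never cut a chunk shorter than this
MAX_SIZE = 256         # always cut once a chunk reaches this size
MASK = (1 << 6) - 1    # cut when the low 6 hash bits are zero (~1 in 64 positions)

# Fixed table of random 32-bit values, one per byte value (a "gear" table).
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def chunk(data: bytes) -> list:
    """Split data into variable-sized chunks at content-defined boundaries."""
    chunks = []
    start = 0
    h = 0
    for i, b in enumerate(data):
        # Rolling hash: the low bits depend only on the most recent bytes,
        # so boundaries are decided by local content, not by file offsets.
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def store(data: bytes, chunk_store: dict) -> tuple:
    """Add unseen chunks to chunk_store; return (manifest, bytes newly stored)."""
    manifest = []
    uploaded = 0
    for c in chunk(data):
        key = hashlib.sha256(c).hexdigest()
        if key not in chunk_store:   # deduplication: known chunks are skipped
            chunk_store[key] = c
            uploaded += len(c)
        manifest.append(key)
    return manifest, uploaded


def restore(manifest: list, chunk_store: dict) -> bytes:
    """Rebuild a file version from its ordered list of chunk hashes."""
    return b"".join(chunk_store[k] for k in manifest)
```

Storing a second version of a file with `store(v2, chunk_store)` only adds the chunks that are not already in `chunk_store`, which is the "only upload the chunks that changed" behavior in miniature. Each version is kept as a manifest of chunk hashes, so `restore` can rebuild any stored version.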

We're planning to bring our new storage backend to the Hub in early 2025 - check out our blog to dive deeper, and let us know: how could this improve your workflows?

https://huggingface.co/blog/from-files-to-chunks
updated a model about 2 months ago
liked a Space about 2 months ago
liked a Space about 2 months ago