Oleksii Maryshchenko

omaryshchenko

AI & ML interests

None yet

Recent Activity

liked a Space 13 days ago
Qwen/Qwen2.5-Coder-Artifacts
upvoted an article about 1 month ago

Organizations

None yet

omaryshchenko's activity

upvoted an article about 1 month ago

Transformers.js v3: WebGPU support, new models & tasks, and more…

Reacted to Xenova's post with 🚀 5 months ago
Florence-2, the new vision foundation model by Microsoft, can now run 100% locally in your browser on WebGPU, thanks to Transformers.js! 🤗🤯

It supports tasks like image captioning, optical character recognition, object detection, and many more! 😍
- Demo: Xenova/florence2-webgpu
- Models: https://huggingface.co/models?library=transformers.js&other=florence2
- Source code: https://github.com/xenova/transformers.js/tree/v3/examples/florence2-webgpu
Reacted to merve's post with 🤗 5 months ago
Fine-tune Florence-2 on any task 🔥

Today we release a notebook and a walkthrough blog on fine-tuning Florence-2 on the DocVQA dataset @andito @SkalskiP

Blog: https://huggingface.co/blog 📕
Notebook: https://colab.research.google.com/drive/1hKDrJ5AH_o7I95PtZ9__VlCTNAo1Gjpf?usp=sharing 📖
Florence-2 is a great vision-language model thanks to its massive dataset and small size!

This model requires conditioning through task prefixes, and it's not a generalist model: it needs fine-tuning for a new task such as DocVQA 📝

We fine-tuned the model on an A100 (one can also use a smaller GPU with a smaller batch size) and saw that the model picks up new tasks 🥹

See below how it looks before and after fine-tuning 🤩
Play with the demo here: andito/Florence-2-DocVQA 🏄‍♀️
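The task-prefix conditioning described above can be sketched as a tiny prompt builder. This is a hypothetical helper, not the notebook's code; the prefix strings follow commonly documented Florence-2 task tokens, so double-check them against the model's processor config before relying on them:

```python
# Hypothetical sketch of Florence-2-style task-prefix conditioning:
# the model is steered by prepending a task token to the text input.

TASK_PREFIXES = {
    "caption": "<CAPTION>",
    "detailed_caption": "<DETAILED_CAPTION>",
    "ocr": "<OCR>",
    "object_detection": "<OD>",
    "docvqa": "<DocVQA>",  # prefix used when fine-tuning on DocVQA
}

def build_prompt(task: str, text: str = "") -> str:
    """Prepend the task prefix; tasks like captioning take no extra text."""
    return TASK_PREFIXES[task] + text

# A DocVQA-style input pairs the prefix with a question about the document:
prompt = build_prompt("docvqa", "What is the invoice total?")
```

Fine-tuning on a new task then amounts to training on (prefix + question, answer) pairs while keeping the same architecture.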
Reacted to merve's post with 👀 5 months ago
Florence-2 is a new vision foundation model capable of a wide variety of tasks 🤯
Demo πŸ‘‰πŸ» gokaygokay/Florence-2
Collection πŸ‘‰πŸ» microsoft/florence-6669f44df0d87d9c3bfb76de

This model can handle tasks that vary from OCR to semantic segmentation.

What sets it apart from previous models is the dataset: the authors compiled 126M images with 5.4B annotations, pseudo-labelled by their own data engine using smaller specialized models and APIs.

The model has a similar architecture to previous models: an image encoder and a multimodality encoder with a text decoder. The authors have compiled the multitask dataset with prompts for each task.

You can also fine-tune this model on any task of your choice. The authors also reported results on downstream tasks, with and without freezing the vision encoder 🤓📉
They have released fine-tuned models too; you can find them in the collection above 🤗
New activity in Xenova/Phi-3-mini-4k-instruct 7 months ago

Awww yes!

#2 opened 7 months ago by BoscoTheDog
liked a Space 7 months ago
Reacted to merve's post with ❤️ 7 months ago
just landed on the Hugging Face Hub: a community-led computer vision course 📖🤝
learn everything from the fundamentals to the details of bleeding-edge vision transformers!
Reacted to vikhyatk's post with 🚀 8 months ago
Released a new version of vikhyatk/moondream2 today! It's primarily focused on improving OCR and captioning (e.g. "Describe this image", "Describe this image in one sentence"), but we're also seeing general improvements across all benchmarks.
Reacted to Xenova's post with ❤️ 9 months ago
Introducing the 🤗 Transformers.js WebGPU Embedding Benchmark! ⚡️
👉 Xenova/webgpu-embedding-benchmark 👈

On my device, I was able to achieve a 64.04x speedup over WASM! 🤯 How much does WebGPU speed up ML models running locally in your browser? Try it out and share your results! 🚀
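The reported speedup is simply the ratio of baseline (WASM) latency to accelerated (WebGPU) latency. A minimal sketch; the timings below are hypothetical, chosen only to reproduce the 64.04x figure from the post:

```python
# Speedup factor = baseline time / accelerated time.
def speedup(wasm_ms: float, webgpu_ms: float) -> float:
    """Return how many times faster the WebGPU run is than the WASM run."""
    return wasm_ms / webgpu_ms

# Hypothetical timings matching the post's reported speedup:
factor = speedup(wasm_ms=6404.0, webgpu_ms=100.0)
print(f"{factor:.2f}x")  # prints "64.04x"
```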
Reacted to loubnabnl's post with 🤗 9 months ago
⭐ Today we're releasing The Stack v2 & StarCoder2: a series of 3B, 7B & 15B code generation models trained on 3.3 to 4.5 trillion tokens of code:

- StarCoder2-15B matches or outperforms CodeLlama 34B, and approaches DeepSeek-33B on multiple benchmarks.
- StarCoder2-3B outperforms StarCoderBase-15B and similar sized models.
- The Stack v2 is a 4x larger dataset than The Stack v1, resulting in 900B unique code tokens 🚀
As always, we released everything from models and datasets to curation code. Enjoy!

🔗 StarCoder2 collection: bigcode/starcoder2-65de6da6e87db3383572be1a
🔗 Paper: https://drive.google.com/file/d/17iGn3c-sYNiLyRSY-A85QOzgzGnGiVI3/view
🔗 Blog post: https://huggingface.co/blog/starcoder2
🔗 Code Leaderboard: bigcode/bigcode-models-leaderboard
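One common way code models like these are prompted is fill-in-the-middle (FIM): the model completes a gap between a known prefix and suffix. A minimal sketch assuming the StarCoder-style special tokens `<fim_prefix>`, `<fim_suffix>`, and `<fim_middle>`; verify the exact token names against the StarCoder2 tokenizer config before relying on them:

```python
# Build a fill-in-the-middle (FIM) prompt in the StarCoder style:
# the model generates the missing middle after the <fim_middle> token.
def fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a FIM prompt from the code before and after the gap."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

# Ask the model to fill in the body of a function:
prompt = fim_prompt("def add(a, b):\n    return ", "\n")
```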