onekq 
posted an update 13 days ago
I just compared CPU vs GPU. A CPU is actually fine for tasks with a short prompt and a long answer. For such tasks, we usually treat the LLM as a consultant or teacher.

Say you are filing taxes and ask "what is form XXXX?" The chatbot will return an essay explaining the form and walking you through scenarios.

But when you decide to file this form, the LLM becomes your assistant/agent. Suddenly the prompt becomes (much) longer than the answer. You throw in a bunch of documents and ask the LLM to fill out the form for you.

This is when we need a GPU. I will get into details in the next post.
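To make the tradeoff concrete, here is a minimal back-of-the-envelope sketch. The throughput numbers are purely hypothetical assumptions (not measurements from the post): the idea is that CPUs come much closer to GPUs in token generation (memory-bound decode) than in prompt processing (compute-bound prefill), so a long-prompt "agent" workload widens the GPU's lead far more than a short-prompt "consultant" workload.

```python
# Illustrative sketch of prefill-vs-decode latency. All rates are
# hypothetical assumptions, chosen only to show the shape of the tradeoff.
RATES = {
    # (prefill tokens/s, decode tokens/s) -- made-up numbers
    "cpu": (60.0, 12.0),
    "gpu": (900.0, 45.0),
}

def total_seconds(device: str, prompt_tokens: int, answer_tokens: int) -> float:
    """Rough end-to-end latency: prompt-processing time + generation time."""
    prefill_tps, decode_tps = RATES[device]
    return prompt_tokens / prefill_tps + answer_tokens / decode_tps

# "Consultant" shape: short prompt, long answer (e.g. explain a tax form).
consult = {"prompt_tokens": 30, "answer_tokens": 800}
# "Agent" shape: long prompt (documents), short answer (a filled form).
agent = {"prompt_tokens": 8000, "answer_tokens": 200}

for name, shape in [("consultant", consult), ("agent", agent)]:
    cpu_t = total_seconds("cpu", **shape)
    gpu_t = total_seconds("gpu", **shape)
    print(f"{name}: cpu {cpu_t:.0f}s vs gpu {gpu_t:.0f}s "
          f"(gpu {cpu_t / gpu_t:.1f}x faster)")
```

With these made-up rates, the GPU is only a few times faster on the consultant shape, but an order of magnitude faster on the agent shape, because decode time dominates the first and prefill time dominates the second.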

I think it depends on your hardware and the model architecture. E.g., I have an undervolted 4060 and a Ryzen 5 (six cores at 4.2 GHz each).
Whenever I compare a model (int4 to int8, 1B to 10B) on both processors, the GPU always ends up faster by at least 10 t/s in generation and 50 t/s in prompt processing.