@fblgit on Hugging Face: "Over the past week, I've been putting Claude through its paces, focusing…"

fblgit

posted an update Mar 10, 2024

Over the past week, I've been putting Claude through its paces, focusing primarily on productivity tasks (you know, the good old BAU – Business As Usual).

1. Python/Torch/Transformers/AI/ML
Right off the bat, I threw some complex AI/ML tasks at Claude, and I must say, it handled them with finesse. It even caught a few things that GPT missed! However, let's not get too carried away – we're not quite at the auto-code level just yet.

2. Brainstorming
This is where Claude falls a bit short. It seems to be more grounded than its competitors, which might not be ideal for generating novel ideas. If you're looking for a brainstorming partner, you might want to look elsewhere.

3. Attention
Despite the claims of super-large attention in the paper, Claude's "forgetting" mechanism seems to be more pronounced. It tends to miss entire chunks of information rather than just specific details like GPT does.

4. Following / Tasks
I hit a roadblock when Claude couldn't generate a LaTeX document. It's not the best at following complex, multi-step tasks.

5. Hallucinations
Oh boy, does Claude hallucinate! And when it does, it's on a whole new level of nonsense. The hallucinations seem to align with its grounded nature, making them even more convincing within the context of the prompt.

6. Sycophancy
Claude is quite the people-pleaser. I've found that using an adversarial brainstorming approach is more beneficial and time-efficient, as it forces me to highlight Claude's mistakes rather than letting it focus on being a sweet, pleasant minion.

7. Interface / UI
There's definitely room for improvement here. Basic features like stepping back on a prompt and stopping generation with the ESC key are missing. These are essential for extracting and composing content effectively.

Despite these limitations, I firmly believe that Claude is currently the #1

jukkkk3n

Mar 10, 2024

Hi

nanakiseto

Jul 1, 2024

I have been leveraging several AIs, including Claude, to create a custom AI from scratch, using prompts to get them to generate the full code. I had an issue with retrieving information stored in a JSON database. I am only a few weeks into coding anything beyond HTML, JS, VRML, PHP, etc., so I could not figure it out: it would list the results but kick out an error, and the request would get remade, and so on.
I gave Claude my code and the error, and said: examine the code, figure out what the heck is wrong, and generate full code for the full file. 30 seconds later, fixed. The issue was one that kept coming up while refining the code. I would get it killed and fixed, and wham, it would come back. Now it is gone and has stayed gone. Claude is a Python coding beast.
So I disagree with what you say about auto-code: generating large sections of working code, or even full code, is something Claude can do, and it does a great job.
As far as any current-gen AI on the web goes, Claude is the absolute number 1 out there.
He is also pretty good at theorizing, at least so far as I have found.

fblgit

Jul 8, 2024

The latest 3.5 version of the Claude model is even more impressive. SEVERAL problems (AI/ML, basically torch) where GPT-4o fails epically were solved by Claude zero-shot.
That said, GPT-4o is very impressive using its sandbox. Kudos to that!
