fblgit (FBL)

Posts 6

Post

474

Introducing UNA-ThePitbull Series

We are happy to announce the release of our latest model, UNA-ThePitbull, the most powerful model below 70B in the industry. In this new generation, inspired by our previous Beagle series, we curated a model that nicely balances EQ and IQ. It was trained with some of the latest datasets, including:
* Replete-AI/code_bagel_hermes-2.5
* mlabonne/orpo-dpo-mix-40k
* jondurbin/py-dpo-v0.1
It is available on the Hub at fblgit/UNA-ThePitbull-21.4B-v2, and you can grab quantized versions, sponsored by @bartowski, at bartowski/UNA-ThePitbull-21.4B-v2-GGUF. They are fully compatible with Ollama, llama.cpp, etc.

UNA
In this case we tried something new: alternating uniformity across layers of both MLP and Attention, which reduces computational requirements while keeping a highly performant result.

We trained him under these terms:
* ThePitbull-v1 as base: SFT maxLR 1e-4 minLR 5e-5 for 1 Epoch
* DPO maxLR 1e-4 minLR 5e-5 for 1 Epoch
You can continue the training by simply using a maxLR of 5e-5 and 0 warmup steps; this should minimize catastrophic forgetting of the model.
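To make the maxLR/minLR terms concrete, here is a minimal sketch of a learning-rate schedule that decays from maxLR to minLR over one epoch. The cosine shape, the function name `lr_at_step`, and the step counts are assumptions for illustration only; the post states the LR bounds, not the scheduler that was actually used.

```python
import math


def lr_at_step(step, total_steps, max_lr=1e-4, min_lr=5e-5, warmup_steps=0):
    """Hypothetical schedule: optional linear warmup, then cosine decay
    from max_lr down to min_lr (the cosine shape is an assumption)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Linear warmup towards max_lr; skipped entirely when warmup_steps == 0.
    if warmup_steps > 0 and step < warmup_steps:
        return max_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, max(0.0, (step - warmup_steps) / decay_steps))
    # Cosine factor goes from 1 (start of decay) to 0 (end of decay).
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (max_lr - min_lr) * cosine


if __name__ == "__main__":
    total = 1000  # placeholder number of optimizer steps in one epoch
    for step in (0, total // 2, total):
        print(step, lr_at_step(step, total))
```

For continued training as described above, the same sketch can be called with `max_lr=5e-5` and `warmup_steps=0`, so the first step already runs at the full rate. The post does not give a floor for that case, so any `min_lr` passed there is a placeholder.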

Remember, if you do so, please include a Pitbull picture on your model and cite :) Have fun!

Post

Over the past week, I've been putting Claude through its paces, focusing primarily on productivity tasks (you know, the good old BAU – Business As Usual).

1. Python/Torch/Transformers/AI/ML
Right off the bat, I threw some complex AI/ML tasks at Claude, and I must say, it handled them with finesse. It even caught a few things that GPT missed! However, let's not get too carried away – we're not quite at the auto-code level just yet.

2. Brainstorming
This is where Claude falls a bit short. It seems to be more grounded than its competitors, which might not be ideal for generating novel ideas. If you're looking for a brainstorming partner, you might want to look elsewhere.

3. Attention
Despite the claims of super-large attention in the paper, Claude's "forgetting" seems to be more pronounced than GPT's. It tends to miss entire chunks of information, whereas GPT loses only specific details.

4. Following / Tasks
I hit a roadblock when Claude couldn't generate a LaTeX document. It's not the best at following complex, multi-step tasks.

5. Hallucinations
Oh boy, does Claude hallucinate! And when it does, it's on a whole new level of nonsense. The hallucinations seem to align with its grounded nature, making them even more convincing within the context of the prompt.

6. Sycophancy
Claude is quite the people-pleaser. I've found that using an adversarial brainstorming approach is more beneficial and time-efficient, as it forces me to highlight Claude's mistakes rather than letting it focus on being a sweet, pleasant minion.

7. Interface / UI
There's definitely room for improvement here. Basic features like stepping back on a prompt and stopping generation with the ESC key are missing. These are essential for extracting and composing content effectively.

Despite these limitations, I firmly believe that Claude is currently the #1

Collections 4

spaces 1

Fblgit Una Cybertron 7b V2 Bf16

models 22

datasets 3

fblgit/simple-math-DPO

Viewer • Updated Jan 27 • 25 • 13

fblgit/simple-math

Viewer • Updated Jan 27 • 17 • 13

fblgit/tree-of-knowledge

Updated May 24, 2023 • 9 • 18

FBL PRO

AI & ML interests

Articles

Introducing UNA-ThePitbull Series

Organizations

Collections 4

fblgit/juanako-7b-UNA

TheBloke/juanako-7B-UNA-GGUF

TheBloke/juanako-7B-UNA-GPTQ

bartowski/juanako-7b-UNA-exl2

fblgit/una-cybertron-7b-v3-OMA

fblgit/una-cybertron-7b-v2-bf16

fblgit/una-cybertron-7b-v1-fp16

bartowski/una-cybertron-7b-v2-exl2

models 22

fblgit/UNA-ThePitbull-21.4B-v2

fblgit/UNA-ThePitbull-21.4-v1

fblgit/UNAversal-8x7B-v1beta

fblgit/juanako-7b-UNA

fblgit/UNA-dolphin-2.6-mistral-7b-dpo-laser

fblgit/una-cybertron-7b-v2-bf16

fblgit/UNA-POLAR-10.7B-InstructMath-v2

fblgit/LUNA-SOLARkrautLM-Instruct

fblgit/una-cybertron-7b-v1-fp16

fblgit/una-xaberius-34b-v1beta
