Shinoji Research


AI & ML interests

LLMs, finetuning, finance.

Recent Activity

alicecomfy  updated a model 10 months ago
ShinojiResearch/Senku-70B-Full
alicecomfy  updated a model 11 months ago
ShinojiResearch/Senku-70B
alicecomfy  updated a model 11 months ago
ShinojiResearch/Senku-70B-Q8

ShinojiResearch's activity

fblgit 
posted an update about 2 months ago
Introducing miniclaus 1.5B, a tiny but powerful model. Trained with MagPie and based on the Qwen2.5 1.5B model, it performs very well on many tasks, scoring at the top of its category with impressive results:
* MATH Hard 9.81
* MMLU-Pro 29.37
* GPQA 29.19
* MUSR 42.85
* BBH 42.04

Available already in the hub:
fblgit/miniclaus-qw1.5B-UNAMGS
fblgit 
posted an update about 2 months ago
Cybertron is back:

Today we released the newest version of Cybertron: V4, based on Qwen2.5 7B and trained on MagPie. It scores as the #1 LLM in the 7B & 8B class.

The model hasn't gone through DPO, so the weights are in good shape to welcome further training sessions and optimizations.
Enjoy it in the hub as usual:
fblgit/cybertron-v4-qw7B-MGS
fblgit 
posted an update 7 months ago
Introducing UNA-ThePitbull Series

We are happy to announce the release of our latest model, UNA-ThePitbull, the most powerful model below 70B in the industry. In this new generation, inspired by our previous Beagle series, we curated a model that nicely balances EQ and IQ. It was trained with some of the latest datasets, including:
* Replete-AI/code_bagel_hermes-2.5
* mlabonne/orpo-dpo-mix-40k
* jondurbin/py-dpo-v0.1
It is available in the hub at fblgit/UNA-ThePitbull-21.4B-v2, and you can grab quantized versions, sponsored by @bartowski, at bartowski/UNA-ThePitbull-21.4B-v2-GGUF. They are fully compatible with Ollama, llama.cpp, etc.

UNA
In this case we tried something new: alternating uniformity across the layers of both MLP & Attention, which reduces computational requirements while keeping high performance.

We trained it under these terms:
* ThePitbull-v1 as base: SFT maxLR 1e-4 minLR 5e-5 for 1 Epoch
* DPO maxLR 1e-4 minLR 5e-5 for 1 Epoch
You can continue the training by simply using a maxLR of 5e-5 and 0 warmup steps, which should minimize catastrophic forgetting.
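
To make the numbers above concrete, here is a minimal sketch of a learning-rate schedule in plain Python. The post only states the max and min learning rates, so the cosine decay shape, the step counts, and the `min_lr=0.0` used for the continued run are assumptions for illustration, and `lr_at` is a hypothetical helper, not part of any library.

```python
import math


def lr_at(step, total_steps, max_lr, min_lr, warmup_steps=0):
    """Learning rate at `step`: optional linear warmup, then cosine decay.

    The cosine shape is an assumption; the post only gives maxLR and minLR.
    """
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp up to max_lr over the warmup steps.
        return max_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (max_lr - min_lr) * cosine


# Original run as described in the post: maxLR 1e-4 decaying to minLR 5e-5.
# The 1000-step length is a placeholder.
original = [lr_at(s, 1000, max_lr=1e-4, min_lr=5e-5) for s in range(1001)]

# Continued training: maxLR 5e-5 with 0 warmup steps.
# min_lr=0.0 and the step count are placeholders, not values from the post.
continued = [lr_at(s, 1000, max_lr=5e-5, min_lr=0.0, warmup_steps=0)
             for s in range(1001)]
```

Under these assumptions the continued run starts at the same learning rate where the original run ended, so the first update is no larger than the last one the model saw. That is one plausible reading of why skipping warmup and capping at 5e-5 would limit forgetting.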

Remember if you do so, please include a Pitbull picture on your model and cite :) Have fun!
fblgit 
posted an update 10 months ago
Over the past week, I've been putting Claude through its paces, focusing primarily on productivity tasks (you know, the good old BAU – Business As Usual).

1. Python/Torch/Transformers/AI/ML
Right off the bat, I threw some complex AI/ML tasks at Claude, and I must say, it handled them with finesse. It even caught a few things that GPT missed! However, let's not get too carried away – we're not quite at the auto-code level just yet.

2. Brainstorming
This is where Claude falls a bit short. It seems to be more grounded than its competitors, which might not be ideal for generating novel ideas. If you're looking for a brainstorming partner, you might want to look elsewhere.

3. Attention
Despite the claims of super-large attention in the paper, Claude's "forgetting" mechanism seems to be more pronounced. It tends to miss entire chunks of information rather than just specific details like GPT does.

4. Following / Tasks
I hit a roadblock when Claude couldn't generate a LaTeX document. It's not the best at following complex, multi-step tasks.

5. Hallucinations
Oh boy, does Claude hallucinate! And when it does, it's on a whole new level of nonsense. The hallucinations seem to align with its grounded nature, making them even more convincing within the context of the prompt.

6. Sycophancy
Claude is quite the people-pleaser. I've found that using an adversarial brainstorming approach is more beneficial and time-efficient, as it forces me to highlight Claude's mistakes rather than letting it focus on being a sweet, pleasant minion.

7. Interface / UI
There's definitely room for improvement here. Basic features like stepping back on a prompt and stopping generation with the ESC key are missing. These are essential for extracting and composing content effectively.

Despite these limitations, I firmly believe that Claude is currently the #1
fblgit 
posted an update 10 months ago
Senku-70B is still undefeated on EQ-Bench. The latest update from the author shows a further increase in performance, reaching a new score of 85.09.

This new mark outperforms some GPT-4 models, further closing the very thin gap between open-community LLMs and closed-source models.

ShinojiResearch/Senku-70B-Full
fblgit 
posted an update 11 months ago
Introducing UNA-SimpleSmaug-34b:

Based on Smaug-34B-v0.1, it slightly outperforms its base model, with improved math and reasoning thanks to the simple-math dataset.
The model performs well across diverse tasks, with excellent and balanced behaviour.
It scores 77.41 AVG on the Leaderboard, landing in the #1 position among 34B models.

Available in the hub already:
fblgit/UNA-SimpleSmaug-34b-v1beta
fblgit/simple-math

In this case, we applied UNA to the Attention layers of the model while performing SFT with simple-math, a dataset of generated, high-complexity mathematics, demonstrating the effect of simple-math on LLMs.
fblgit 
posted an update 11 months ago
Introducing model-similarities, a new simple tool to contrast two models

A straightforward yet insightful tool designed to shed light on the similarities between various models. Discover it now at [Model Similarity GitHub Repository](https://github.com/fblgit/model-similarity).

This project is in its nascent stages, and we're eager for contributions and enhancements. Crafted with simplicity at its core, the tool performs two primary comparisons:
- Weight similarities, utilizing a simple approach to contrast vector differences (A != B).
- Cosine similarity between the parameters of models A and B, providing a nuanced measure of their alignment.

Included in the repository are sample analyses and reports that validate model card claims, particularly regarding the training specifics of transformer components such as MLP, Attention, etc. Remarkably, these samples reveal 100% similarity scores between those parts of the models, pinpointing the exact base model utilized.
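
The two comparisons described above can be sketched in a few lines of plain Python. This is an illustration of the idea only, not the code from the repository: the function names, the dict-of-lists stand-in for model weights, and the toy values are all hypothetical.

```python
import math


def weight_match_ratio(a, b):
    """Fraction of positions where two flattened tensors hold equal values."""
    if not a or len(a) != len(b):
        raise ValueError("tensors must be non-empty and of equal length")
    same = sum(1 for x, y in zip(a, b) if x == y)
    return same / len(a)


def cosine_similarity(a, b):
    """Cosine of the angle between two flattened tensors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def compare_models(model_a, model_b):
    """Compare every parameter name present in both models."""
    report = {}
    for name in sorted(model_a.keys() & model_b.keys()):
        a, b = model_a[name], model_b[name]
        report[name] = {
            "match": weight_match_ratio(a, b),
            "cosine": cosine_similarity(a, b),
        }
    return report


# Toy "models": parameter name -> flattened weights (hypothetical values).
base = {"mlp.weight": [0.5, -1.0, 2.0, 0.25],
        "attn.weight": [1.0, 0.0, -0.5, 0.75]}
tuned = {"mlp.weight": [0.5, -1.0, 2.0, 0.25],
         "attn.weight": [1.0, 0.5, -0.5, 0.5]}

report = compare_models(base, tuned)
```

In this toy case the MLP weights are untouched, so they match fully, while the attention weights differ. That is the kind of per-component signal the post describes for checking which parts of a model were actually trained.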

Join us in refining and expanding this tool. Whether you're looking to contribute code, ideas, or both, your input will help transform this into a resource for everyone.
fblgit 
posted an update 11 months ago
Presenting: SimpleMath

We recently uploaded to the hub our latest and most powerful version of the SimpleMath SFT dataset.
Today we are happy to present SimpleMath DPO Pairs, which further improve the mathematical capabilities of LLMs.

Our first results show clear improvements on GSM8k, MATHQA, ARC, TQA, MMLU and BBH. Feel free to experiment and generate your own dataset, as we also provide the code to generate them synthetically.
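
As a rough idea of what synthetic generation of preference pairs can look like, here is a small sketch in plain Python. It is not the generator shipped with the dataset: the prompt wording, the operators, the way the wrong answer is produced, and the `prompt`/`chosen`/`rejected` field names (a common convention in DPO tooling) are all assumptions.

```python
import random


def make_pair(rng):
    """Build one arithmetic question with a correct and an incorrect answer."""
    a, b = rng.randint(1, 99), rng.randint(1, 99)
    op = rng.choice(["+", "-", "*"])
    correct = {"+": a + b, "-": a - b, "*": a * b}[op]
    # The rejected answer is offset by a non-zero amount, so it is always wrong.
    wrong = correct + rng.choice([-3, -2, -1, 1, 2, 3])
    return {
        "prompt": f"What is {a} {op} {b}?",
        "chosen": str(correct),
        "rejected": str(wrong),
    }


def make_dataset(n, seed=0):
    """Generate n pairs deterministically from a seed."""
    rng = random.Random(seed)
    return [make_pair(rng) for _ in range(n)]


pairs = make_dataset(5)
for pair in pairs:
    print(pair)
```

Seeding the generator keeps the output reproducible, which makes it easier to compare training runs on the same synthetic data.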

fblgit/simple-math
fblgit/simple-math-DPO
fblgit/UNA-34BeagleSimpleMath-32K-v1