Jim Lai

grimjim

AI & ML interests

Experimenting primarily with 7B-12B parameter text completion models. Not all models are intended for direct use; some are meant for research and/or educational purposes.

Recent Activity

posted an update 3 days ago

This recent paper points to an explanation for the unreasonable effectiveness of frankenmerges: https://huggingface.co/papers/2502.05171

Specifically, the duplication of layers in frankenmerges serves a purpose similar to what occurs in the paper's recurrent-depth architecture. Successful frankenmerges that operate without additional fine-tuning are able to recover or "heal" from any damage due to abrupt transitions between layer blocks. Replicated layer blocks that remain operational can provide functional benefits grounded in latent reasoning. Frankenmerges can also result in hybrid reasoning, by splicing together the latent reasoning of different models.

Back in April 2024, I was able to duplicate a few layers in the Llama 3 8B model, turning it into a 9B model, without significantly harming benchmark scores, despite any transition damage. https://huggingface.co/grimjim/llama-3-experiment-v1-9B

My informal experimentation suggested that latent reasoning circuits could occupy contiguous stacks of 2-4 layers, though the result was highly sensitive to the choice of transition location between layers.
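As a rough illustration of the layer duplication described above, the sketch below plans a stack in which one contiguous block of layers appears twice. It only computes layer indices and does not touch any weights. The function names and the choice of duplicating layers 12-15 are hypothetical and are not the recipe used for llama-3-experiment-v1-9B; the only fact assumed about the base model is that Llama 3 8B has 32 decoder layers.

```python
from typing import List, Tuple


def plan_with_duplicated_block(num_layers: int, start: int, end: int) -> List[int]:
    """Return source-layer indices for a stack that repeats layers [start, end) once."""
    if not 0 <= start < end <= num_layers:
        raise ValueError("need 0 <= start < end <= num_layers")
    # Run layers 0..end-1, then jump back to `start` and run through to the last layer.
    return list(range(0, end)) + list(range(start, num_layers))


def to_slices(plan: List[int]) -> List[Tuple[int, int]]:
    """Group a plan into contiguous half-open (start, end) slices of source layers."""
    slices = []
    run_start = prev = plan[0]
    for idx in plan[1:]:
        if idx != prev + 1:  # an abrupt transition between layer blocks
            slices.append((run_start, prev + 1))
            run_start = idx
        prev = idx
    slices.append((run_start, prev + 1))
    return slices


# Hypothetical example: duplicate a 4-layer block in a 32-layer model.
plan = plan_with_duplicated_block(num_layers=32, start=12, end=16)
print(len(plan))        # 36
print(to_slices(plan))  # [(0, 16), (12, 32)]
```

Each boundary between slices is one of the abrupt transitions mentioned in the post, so moving `start` and `end` by even a single layer changes where the model has to "heal".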

new activity 3 days ago

open-llm-leaderboard/open_llm_leaderboard:Spurious `trust_remote_code=True` objection when submitting a model?

updated a model 3 days ago

grimjim/Magnolia-v5-12B

grimjim's activity

New activity in open-llm-leaderboard/open_llm_leaderboard 3 days ago

Spurious `trust_remote_code=True` objection when submitting a model?

#1100 opened 3 days ago by

grimjim

New activity in grimjim/DeepSauerHuatuoSkywork-R1-o1-Llama-3.1-8B 9 days ago

Adding Evaluation Results

#1 opened 9 days ago by

T145

New activity in google/gemma-2-2b-it 17 days ago

SLERP merge example code?

#20 opened 7 months ago by

grimjim

New activity in grimjim/SauerHuatuoSkywork-o1-Llama-3.1-8B 20 days ago

Adding Evaluation Results

#1 opened 20 days ago by

T145

New activity in FreedomIntelligence/HuatuoGPT-o1-8B 20 days ago

Please submit this model to the Open LLM Leaderboard

#1 opened about 1 month ago by

grimjim

New activity in grimjim/HuatuoSkywork-o1-Llama-3.1-8B about 1 month ago

Adding Evaluation Results

#1 opened about 1 month ago by

T145

New activity in anthracite-org/magnum-v4-27b 4 months ago

Adding Evaluation Results

#2 opened 4 months ago by

leaderboard-pr-bot

New activity in anthracite-org/magnum-v4-12b 4 months ago

Adding Evaluation Results

#3 opened 4 months ago by

leaderboard-pr-bot

New activity in anthracite-org/magnum-v2-72b 4 months ago

Adding Evaluation Results

#6 opened 4 months ago by

kirin7

New activity in anthracite-org/magnum-v3-27b-kto 5 months ago

Adding Evaluation Results

#4 opened 5 months ago by

CombinHorizon

New activity in grimjim/PAlign-PAPI-personality_prompt.json-cleaned 5 months ago

[bot] Conversion to Parquet

#1 opened 5 months ago by

parquet-converter

New activity in grimjim/llama-3-Nephilim-v2.1-8B 5 months ago

Adding Evaluation Results

#1 opened 5 months ago by

leaderboard-pr-bot

New activity in grimjim/llama-3-Nephilim-v2-8B 5 months ago

Adding Evaluation Results

#2 opened 5 months ago by

leaderboard-pr-bot

New activity in grimjim/Llama-3.1-8B-Instruct-abliterated_via_adapter-GGUF 5 months ago

fp16 version?

#2 opened 5 months ago by

Tycho-S

New activity in grimjim/Llama-3.1-8B-Instruct-abliterated_via_adapter 5 months ago

Adding Evaluation Results

#3 opened 5 months ago by

leaderboard-pr-bot

New activity in grimjim/Llama-3-Instruct-8B-SimPO-SPPO-Iter3-merge 5 months ago

Adding Evaluation Results

#1 opened 5 months ago by

leaderboard-pr-bot

New activity in grimjim/Llama-3-Instruct-8B-SPPO-Iter3-SimPO-merge 5 months ago

Adding Evaluation Results

#1 opened 5 months ago by

leaderboard-pr-bot

New activity in anthracite-org/magnum-v2-12b 5 months ago

Adding Evaluation Results

#8 opened 5 months ago by

leaderboard-pr-bot

New activity in anthracite-org/magnum-v3-34b 5 months ago

Adding Evaluation Results

#1 opened 5 months ago by

leaderboard-pr-bot

New activity in grimjim/Llama-3.1-8B-Instruct-abliterated_via_adapter-GGUF 6 months ago

q8 gives error in LM studio: "Checksum failed file corrupted"

#1 opened 6 months ago by

tazztone