Djuunaa

djuna

AI & ML interests

None yet

Recent Activity

new activity 1 day ago
dnhkng/RYS-XLarge:RYS with Qwen2.5

Organizations

Djuna Test Lab

djuna's activity

New activity in dnhkng/RYS-XLarge 1 day ago

RYS with Qwen2.5

1
#5 opened about 2 months ago by
PSM272
reacted to Elizezen's post with 👀 1 day ago
Post
2849
It turns out that the following simple method seems to be quite effective when you want to increase the appearance probability of only one token or a very limited number of tokens.

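one_token = "♡"  # Token to increase the appearance probability
value = 1000000  # Number of times the token is repeated in the training file

# The training text is just the token repeated back-to-back.
token = one_token * value

with open("one-token.txt", "w", encoding="utf-8") as f:
    f.write(token)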


By training a LoRA with unsloth on the .txt file generated by the code above, you can increase the appearance probability of specific tokens while largely maintaining the model's performance. However, it's better to stop training before the training loss reaches 0.0; otherwise the model will start spamming the token as soon as it appears even once. In general, you can stop training at a very early stage and it will still work.
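As a rough illustration of the "stop before the loss reaches 0.0" advice, here is a minimal sketch of a stopping check. The function name `should_stop`, the `floor` threshold of 0.05, and the loss values are all hypothetical and not from the post; in a real run the check would be wired into whatever training loop or callback mechanism you use.

```python
from typing import List


def should_stop(loss_history: List[float], floor: float = 0.05) -> bool:
    """Return True once the latest training loss is at or below `floor`.

    `floor` is an assumed, illustrative threshold: the idea is only to
    halt well before the loss collapses to 0.0.
    """
    return bool(loss_history) and loss_history[-1] <= floor


# Made-up per-step training losses, for illustration only.
losses = [2.31, 0.84, 0.12, 0.04, 0.0]

# Index of the first step at which the check fires.
stop_step = next(
    step for step in range(len(losses)) if should_stop(losses[: step + 1])
)
```

With these made-up values the check fires at step index 3, one step before the loss would have hit 0.0.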

It is also possible to reduce the appearance probability of specific tokens. Train an overfitted LoRA on the tokens you want to reduce, merge it into the model, extract only the difference using the chat vector method, and subtract that difference from an arbitrary model.

In this case, it is better to scale the chat vector by a factor of about five. Apart from the specific tokens, this has very little effect on overall performance.

new_v = v - (5.0 * chat_vector[i].to(v.device))
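
The extract-then-subtract procedure described above can be sketched end to end as follows. This is a toy illustration under stated assumptions, not the author's script: plain Python lists stand in for weight tensors, and the helper names `extract_chat_vector` and `subtract_chat_vector`, the `lm_head` key, and all numeric values are hypothetical.

```python
from typing import Dict, List

# Toy stand-in for a model state dict: parameter name -> flat list of weights.
Weights = Dict[str, List[float]]


def extract_chat_vector(merged: Weights, base: Weights) -> Weights:
    """Per-parameter difference: (base with overfitted LoRA merged) minus base."""
    return {
        name: [m - b for m, b in zip(merged[name], base[name])]
        for name in base
    }


def subtract_chat_vector(
    target: Weights, chat_vector: Weights, ratio: float = 5.0
) -> Weights:
    """Apply new_v = v - ratio * chat_vector to every parameter of `target`."""
    return {
        name: [v - ratio * d for v, d in zip(values, chat_vector[name])]
        for name, values in target.items()
    }


# Hypothetical values: the overfitted LoRA only moved the second weight.
base = {"lm_head": [0.5, 1.0, -2.0]}
merged = {"lm_head": [0.5, 1.25, -2.0]}
target = {"lm_head": [1.0, 2.0, 3.0]}  # the arbitrary model to be adjusted

chat_vector = extract_chat_vector(merged, base)
new_weights = subtract_chat_vector(target, chat_vector, ratio=5.0)
```

With these toy numbers the chat vector is nonzero only in the second position, so only that weight of the target changes (from 2.0 to 0.75); the others are left untouched, which mirrors the claim that the subtraction mainly affects the specific tokens.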
New activity in djuna/MN-Chinofun-12B-3 2 days ago

Some sample

1
#3 opened 2 days ago by
djuna
New activity in huggingchat/chat-ui 2 days ago

[MODELS] Discussion

525
#372 opened 10 months ago by
victor
reacted to fuzzy-mittenz's post with ❤️ 2 days ago
Post
1352
So a cool thing happened,
Nomic/GPT4ALL released a "Reasoning/Thinking" (QwQ/o1/o3-type) model that uses JavaScript functions to calculate things like the haversine distance between two places, and so on. It's VERY cool to see complex calculative/recursive AI in such a small package.

I was able to adapt their methods to one of my small models, "Replicant" (2 GB), and created a new model with importance-matrix quantization using the "THE_KEY" dataset for better inference in the coding model I pulled from WhiteRabbitNeo's Qwen2.5 model. I give you Reasoning Rabbit. Enjoy!

IntelligentEstate/o3-ReasoningRabbit_Q2.5-Cd-7B-IQ4_XS-GGUF

IntelligentEstate/Replicant_Warder-o3-Q2.5_3B-iQ5_K_S-GGUF

-WhiteRabbitNeo/WhiteRabbitNeo-2.5-Qwen-2.5-Coder-7B

New activity in DontPlanToEnd/UGI-Leaderboard 3 days ago

14B model detected as 7B

5
#1049 opened 3 days ago by
djuna
liked a Space 3 days ago
New activity in djuna-test-lab/mergekit-slerp-ynuqykr 4 days ago

broken tokenizer?

#1 opened 4 days ago by
djuna