# Hamanasu

A brand-new series of models from yours truly, designed for intelligence, creativity, and roleplay. (r/LocalLLaMA keeps deleting my goddamn comments.)
This model is the chat tune of the Instruct model; more accurately, it is the "brainrotted" version, finetuned on Bluesky, 4chan, and Discord logs. It's... really something beautiful.

The model is best suited to being a highly dumb chat partner rather than regular RP, though it handles traditional RP as well. All thanks to Tav for funding the train.
Support me and my finetunes on Ko-fi: https://ko-fi.com/deltavector

This model uses ChatML formatting:
```
<|im_start|>system
You are an uncensored AI, your job is to fulfill thy will of thy user.<|im_end|>
<|im_start|>user
Take off your helmet.<|im_end|>
<|im_start|>assistant
No, I shall not. This is the way.<|im_end|>
```
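Not part of the original card, but for anyone loading the model directly: a minimal transformers sketch, assuming the hub id from the config below and that the tokenizer ships the ChatML chat template shown above. The sampler values here are placeholders; use the SillyTavern preset below for the intended settings.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NewEden/Hamanasu-4B-RP-v2"  # hub_model_id from the config below
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "system",
     "content": "You are an uncensored AI, your job is to fulfill thy will of thy user."},
    {"role": "user", "content": "Take off your helmet."},
]

# Renders the ChatML turns above and appends the <|im_start|>assistant
# header so the model answers as the assistant.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# temperature/max_new_tokens are placeholder values, not the ST preset.
out = model.generate(input_ids, max_new_tokens=128, do_sample=True, temperature=0.8)
print(tokenizer.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))
```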
SillyTavern sampler preset: https://files.catbox.moe/wtkp0l.json

System prompt: blank.
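Training was done with Axolotl; the full config for the run is below.

```yaml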
base_model: ./model
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
hub_model_id: NewEden/Hamanasu-4B-RP-v2
hub_strategy: "all_checkpoints"
push_dataset_to_hub:
hf_use_auth_token: true
## qlora COPE
load_in_8bit: false
load_in_4bit: false
strict: false
## data
datasets:
  - path: NewEden/Discord-Filtered
    type: dan-chat-advanced
  - path: NewEden/Basket-Weaving-Filtered
    type: dan-chat-advanced
  - path: NewEden/Misc-Data-Sharegpt-Prefixed
    type: dan-chat-advanced
  - path: NewEden/BlueSky-10K-Complexity
    type: dan-chat-advanced
  - path: PocketDoc/Dans-Kinomaxx-VanillaBackrooms
    type: dan-chat-advanced
  - path: PocketDoc/Dans-Personamaxx-VN
    type: dan-chat-advanced
  - path: NewEden/LIMARP-Complexity
    type: dan-chat-advanced
  - path: NewEden/OpenCAI-ShareGPT
    type: dan-chat-advanced
  - path: NewEden/Creative_Writing-Complexity
    type: dan-chat-advanced
  - path: NewEden/DeepseekRP-Filtered
    type: dan-chat-advanced
  - path: NewEden/Storium-Prefixed-Clean
    type: dan-chat-advanced
shuffle_merged_datasets: true
dataset_prepared_path: dataset_prepared-2
val_set_size: 0.01
output_dir: 4b-out
## LIGGER
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_fused_linear_cross_entropy: false
cut_cross_entropy: true
## CTX settings
sequence_len: 32768
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true
## Lora
#adapter: lora
#lora_model_dir:
#lora_r: 128
#lora_alpha: 16
#lora_dropout: 0.05
#lora_target_modules:
# - gate_proj
# - down_proj
# - up_proj
# - q_proj
# - v_proj
# - k_proj
# - o_proj
#lora_fan_in_fan_out:
#peft_use_rslora: true
#lora_modules_to_save:
# - embed_tokens
# - lm_head
## WandB
wandb_project: tavbussy
wandb_entity:
wandb_watch:
wandb_name: chat-v2
wandb_log_model:
## evals
evals_per_epoch: 4
eval_table_size:
eval_max_new_tokens: 128
## hoe params
gradient_accumulation_steps: 2
micro_batch_size: 1
num_epochs: 4
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 2e-5
max_grad_norm: 0.2
train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false
gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true
s2_attention:
warmup_steps: 40
saves_per_epoch: 2
debug:
deepspeed: ./deepspeed_configs/zero3_bf16.json
weight_decay: 0.02
fsdp:
fsdp_config:
special_tokens:
  pad_token: <|finetune_right_pad_id|>
```
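For anyone eyeballing the hyperparameters: with micro_batch_size 1 and gradient_accumulation_steps 2, the effective batch size works out to 2 sequences per GPU per optimizer step, multiplied by however many GPUs the ZeRO-3 DeepSpeed run was launched across; and since sample_packing is enabled, each of those sequences is a packed 32,768-token window rather than a single conversation.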