
Uploaded model

  • Developed by: Solshine (Caleb DeLeeuw)
  • License: apache-2.0
  • Finetuned from model: inceptionai/jais-adapted-7b-chat
  • Dataset: CopyleftCultivars/Natural-Farming-Real-QandA-Conversations-Q1-2024-Update (real-world Natural Farming advice from over 12 countries and a multitude of real-world farm operations, using semi-synthetic data curated by domain experts)

This full model is the base model merged with V4 of the trained LoRA adapter (the version with the best training loss curve among the Unsloth configurations tested): Solshine/jais-adapted-7b-chat-Natural-Farmer-lora-only-V4

Training Logs:

  • Unsloth - 2x faster free finetuning
  • Num GPUs = 1
  • Num examples = 169
  • Num Epochs = 2
  • Batch size per device = 2
  • Gradient Accumulation steps = 4
  • Total batch size = 8
  • Total steps = 38
  • Number of trainable parameters = 39,976,960

[38/38 03:29, Epoch 1/2]

| Step | Training Loss |
|------|---------------|
| 1 | 2.286800 |
| 2 | 2.205600 |
| 3 | 2.201700 |
| 4 | 2.158100 |
| 5 | 2.021100 |
| 6 | 1.820200 |
| 7 | 1.822500 |
| 8 | 1.565700 |
| 9 | 1.335700 |
| 10 | 1.225900 |
| 11 | 1.081000 |
| 12 | 0.947700 |
| 13 | 0.828600 |
| 14 | 0.830200 |
| 15 | 0.796300 |
| 16 | 0.781200 |
| 17 | 0.781600 |
| 18 | 0.815000 |
| 19 | 0.741400 |
| 20 | 0.847600 |
| 21 | 0.736600 |
| 22 | 0.714300 |
| 23 | 0.706400 |
| 24 | 0.752800 |
| 25 | 0.684600 |
| 26 | 0.647800 |
| 27 | 0.775300 |
| 28 | 0.613800 |
| 29 | 0.679500 |
| 30 | 0.752900 |
| 31 | 0.589800 |
| 32 | 0.729400 |
| 33 | 0.549500 |
| 34 | 0.638500 |
| 35 | 0.609500 |
| 36 | 0.632200 |
| 37 | 0.686400 |
| 38 | 0.724200 |
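To read the loss log above at a glance, the short sketch below finds the lowest logged loss and averages the final few steps. The loss values are copied from the log; the helper name `summarize_losses` and the five-step window are illustrative choices, not part of the original training run.

```python
# Training losses copied from the log above, in step order (steps 1..38).
LOSSES = [
    2.2868, 2.2056, 2.2017, 2.1581, 2.0211, 1.8202, 1.8225, 1.5657,
    1.3357, 1.2259, 1.0810, 0.9477, 0.8286, 0.8302, 0.7963, 0.7812,
    0.7816, 0.8150, 0.7414, 0.8476, 0.7366, 0.7143, 0.7064, 0.7528,
    0.6846, 0.6478, 0.7753, 0.6138, 0.6795, 0.7529, 0.5898, 0.7294,
    0.5495, 0.6385, 0.6095, 0.6322, 0.6864, 0.7242,
]


def summarize_losses(losses, tail=5):
    """Return (best_step, best_loss, mean of the last `tail` losses).

    Steps are 1-indexed to match the training log.
    """
    best_index = min(range(len(losses)), key=losses.__getitem__)
    tail_mean = sum(losses[-tail:]) / tail
    return best_index + 1, losses[best_index], tail_mean


best_step, best_loss, tail_mean = summarize_losses(LOSSES)
print(f"lowest loss {best_loss:.4f} at step {best_step}")
print(f"mean of last 5 steps: {tail_mean:.4f}")
```

On this log the minimum falls at step 33, and the last five steps range from about 0.61 to 0.72, so the final logged value alone understates how far the loss dropped.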

Merged model (printing trainable parameters; PyTorch) readout:

'''
LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(64000, 4096)
    (layers): ModuleList(
      (0-31): 32 x LlamaDecoderLayer(
        (self_attn): LlamaSdpaAttention(
          (q_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (k_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (v_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (o_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): LlamaMLP(
          (gate_proj): Linear4bit(in_features=4096, out_features=11008, bias=False)
          (up_proj): Linear4bit(in_features=4096, out_features=11008, bias=False)
          (down_proj): Linear4bit(in_features=11008, out_features=4096, bias=False)
          (act_fn): SiLU()
        )
        (input_layernorm): LlamaRMSNorm((4096,), eps=1e-05)
        (post_attention_layernorm): LlamaRMSNorm((4096,), eps=1e-05)
      )
    )
    (norm): LlamaRMSNorm((4096,), eps=1e-05)
    (rotary_emb): LlamaRotaryEmbedding()
  )
  (lm_head): Linear(in_features=4096, out_features=64000, bias=False)
)
'''
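The layer shapes in the readout are enough to work out parameter counts by hand. The sketch below does that arithmetic for the logical (unquantized) shapes and also counts the parameters a LoRA adapter would add at a given rank. Two things here are assumptions rather than facts from the card: that LoRA is applied to all seven projection layers, and the rank passed in. The function names are hypothetical.

```python
# Dimensions read from the module printout above.
HIDDEN, FFN, VOCAB, LAYERS = 4096, 11008, 64000, 32

# (in_features, out_features) of each bias-free linear layer in one decoder layer.
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, FFN),
    "up_proj": (HIDDEN, FFN),
    "down_proj": (FFN, HIDDEN),
}


def base_param_count():
    """Parameters implied by the printed shapes, ignoring 4-bit packing."""
    linears = sum(i * o for i, o in PROJECTIONS.values())
    per_layer = linears + 2 * HIDDEN  # two RMSNorm weight vectors per layer
    embeddings = 2 * VOCAB * HIDDEN   # embed_tokens plus lm_head
    return embeddings + LAYERS * per_layer + HIDDEN  # plus the final norm


def lora_param_count(rank):
    """Parameters added by LoRA on every projection (assumed target set).

    Each adapted layer gains A (rank x in) and B (out x rank).
    """
    per_layer = sum(rank * (i + o) for i, o in PROJECTIONS.values())
    return LAYERS * per_layer


print(f"base parameters: {base_param_count():,}")
print(f"LoRA parameters at rank 16: {lora_param_count(16):,}")
```

The base count comes to 7,000,559,616, in line with the "7b" in the model name. At rank 16 the LoRA count is 39,976,960, which equals the trainable-parameter figure in the training log, so a rank-16 adapter on all seven projections is consistent with the log, although the card does not state the rank.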

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.

**JAIS Adapted 7B Chat merged with V4 LoRA adapters on Google Colab via:**

'''
# Install required libraries
!pip install transformers peft huggingface_hub bitsandbytes accelerate

import os
import torch
from huggingface_hub import login
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
from google.colab import userdata

# Hugging Face credentials
HF_TOKEN = userdata.get('HF_TOKEN')
if HF_TOKEN is None:
    raise ValueError("HF_TOKEN not found in Colab secrets. Please set it up.")

login(token=HF_TOKEN)

# Define model and LORA adapter paths
BASE_MODEL_NAME = "inceptionai/jais-adapted-7b-chat"
LORA_MODEL_NAME = "Solshine/jais-adapted-7b-chat-Natural-Farmer-lora-only-V4"
MERGED_MODEL_NAME = "Solshine/jais-adapted-7b-chat-Natural-Farmer-lora-merged-full"

# Function to get the optimal device map
# (defined in the original script but never called)
def get_device_map(model_size):
    if torch.cuda.is_available():
        gpu_memory = torch.cuda.get_device_properties(0).total_memory / 1024**3  # Convert to GB
        if gpu_memory > model_size:
            return "auto"
        else:
            return {
                "": torch.device("cpu"),
                "lm_head": torch.device("cuda:0"),
                "model.embed_tokens": torch.device("cuda:0"),
                "model.norm": torch.device("cuda:0"),
                "model.layers.0": torch.device("cuda:0"),
            }
    return "cpu"

from transformers import BitsAndBytesConfig

# Configure quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

# Download and load the base model
print("Downloading and loading the base model...")
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_NAME,
    quantization_config=bnb_config,
    device_map="auto",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    offload_folder="offload",
    offload_state_dict=True,
)

# Enable gradient checkpointing if available
if hasattr(base_model, 'gradient_checkpointing_enable'):
    base_model.gradient_checkpointing_enable()
elif hasattr(base_model, 'model') and hasattr(base_model.model, 'gradient_checkpointing_enable'):
    base_model.model.gradient_checkpointing_enable()

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_NAME)

# Download and load the LORA adapter
# The original passed an undefined `device_map` variable here;
# "auto" mirrors the base model load above.
print("Downloading and loading the LORA adapter...")
lora_model = PeftModel.from_pretrained(base_model, LORA_MODEL_NAME, device_map="auto")

# Merge the base model with the LORA adapter
print("Merging the base model with the LORA adapter...")
merged_model = lora_model.merge_and_unload()

# Save the merged model locally
print("Saving the merged model locally...")
merged_model.save_pretrained("./merged_model", safe_serialization=True)
tokenizer.save_pretrained("./merged_model")

# Push the merged model to Hugging Face Hub
# (`token` replaces the deprecated `use_auth_token` argument)
print("Pushing the merged model to Hugging Face Hub...")
merged_model.push_to_hub(MERGED_MODEL_NAME, token=HF_TOKEN)
tokenizer.push_to_hub(MERGED_MODEL_NAME, token=HF_TOKEN)

print("Process completed successfully!")
'''
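For readers unfamiliar with what the merge step in the script above does, here is a minimal sketch of the underlying arithmetic: a LoRA merge folds the low-rank update into the base weight as W + (alpha / r) * B @ A. This is a toy illustration in plain Python, not the PEFT implementation. The matrices, the `alpha` value, and the helper names `matmul` and `merge_lora` are all made up for the example, and it ignores the 4-bit quantization used in the actual script.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the number of rows of A."""
    scale = alpha / len(lora_a)
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy shapes (illustrative only): out_features=2, in_features=3, rank=2.
W = [[1, 0, 2], [0, 1, -1]]   # base weight, out x in
A = [[1, 1, 0], [0, 2, 1]]    # LoRA down-projection, rank x in
B = [[1, 0], [-1, 1]]         # LoRA up-projection, out x rank
ALPHA = 4                     # hypothetical scaling hyperparameter
x = [[1], [2], [3]]           # one input column vector

merged = merge_lora(W, A, B, ALPHA)

# Unmerged forward pass: base output plus the scaled adapter output.
scale = ALPHA / len(A)
base_out = matmul(W, x)
adapter_out = matmul(B, matmul(A, x))
unmerged_out = [[b[0] + scale * a[0]] for b, a in zip(base_out, adapter_out)]

# Merged forward pass: a single matrix multiply with the folded weight.
merged_out = matmul(merged, x)
print(merged_out == unmerged_out)
```

Because the merged weight reproduces the adapter's contribution exactly in this unquantized setting, the merged model needs no PEFT dependency at inference time. With a 4-bit base, as in the script above, the merged weights are stored in quantized form, so the result may differ slightly from running the base model and adapter separately.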

Safetensors model size: 3.86B params. Tensor types: F32, FP16, U8.