Fine-Tuned Llama Model for Metallurgy and Materials Science

Developed by: Abdulrhman37
License: Apache-2.0
Base Model: unsloth/meta-llama-3.1-8b-bnb-4bit

This fine-tuned Llama model specializes in metallurgy, materials science, and engineering. It has been enhanced to provide precise and detailed responses to technical queries, making it a valuable tool for professionals, researchers, and enthusiasts in the field.

🛠️ Training Details

This model was fine-tuned with:

Unsloth: Enabled about 2x faster training.
Hugging Face TRL: Used for advanced fine-tuning and training capabilities.

Fine-tuning focused on enhancing domain-specific knowledge using a dataset curated from various metallurgical research and practical case studies.

For a detailed walkthrough of the fine-tuning process, refer to this notebook.

🔑 Features

Supports text generation with scientific and technical insights.
Provides domain-specific reasoning with references to key metallurgical principles and mechanisms.
Loads with bitsandbytes 4-bit (bnb-4bit) quantization, which reduces GPU memory usage during inference.

🌟 Example Use Cases

Material property analysis (e.g., "How does adding rare earth elements affect magnesium alloys?").
Failure mechanism exploration (e.g., "What causes porosity in gas metal arc welding?").
Corrosion prevention methods (e.g., "How does cathodic protection work in marine environments?").

📦 How to Use

Install Dependencies:

%%capture
!pip install unsloth

!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git

Load the model:

metallurgy_prompt = """You are a highly knowledgeable assistant specializing in metallurgy, materials science,
and engineering. Below is a technical instruction.Your task is to provide an accurate, domain-specific response that appropriately addresses the request.
Ensure Your response is detailed,Provide scientifically rigorous and quantitative responses,Reference fundamental principles and mechanisms,
Include potential equations, calculations, or microstructural insights where relevant,Support statements with scientific reasoning,
Discuss potential variations or alternative interpretations


### Instruction:
{}

### Input:
{}

### Response:
{}"""

from unsloth import FastLanguageModel
import torch

max_seq_length = 2048  # Choose any! Unsloth supports RoPE scaling internally.
dtype = None  # None for auto-detection. Float16 for Tesla T4, V100; Bfloat16 for Ampere+.
load_in_4bit = True  # Use 4-bit quantization to reduce memory usage. Can be False.

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Abdulrhman37/lora_model",  # The fine-tuned model
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
FastLanguageModel.for_inference(model)  # Enable native 2x faster inference

Use the fine-tuned model:

# Function to process a question
def answer(q: str):
    """
    Generates a detailed response to a metallurgy-related question using the fine-tuned language model.

    Args:
        q (str): The question or instruction to be answered.

    Returns:
        str: The generated response from the model, specifically the content after "### Response:".
    """

    # Initialize the language model for fast inference
    FastLanguageModel.for_inference(model)  # Enables 2x faster native inference

    # Format the input question using the metallurgy prompt template
    inputs = tokenizer(
        [
            metallurgy_prompt.format(
                q,  # Instruction: The main question
                "",  # Input: Empty for now as no specific input is provided
                ""   # Output: Placeholder for the generated response
            )
        ],
        return_tensors="pt"  # Return input tensors
    ).to("cuda")  # Transfer tensors to GPU for faster computation

    # Generate the model's output based on the formatted input
    outputs = model.generate(**inputs, use_cache=True)  # Use cached values to speed up decoding

    # Decode the model's output into readable text
    result = tokenizer.batch_decode(outputs)

    # Split the result into sections before and after "### Response:"
    split_content = result[0].split("### Response:")
    before_response = split_content[0].strip()  # Extract content before "Response"
    after_response = split_content[1].strip().replace('<|end_of_text|>', '')  # Clean up response content

    # Prepare a detailed response dictionary for debugging or additional processing
    detailed = {
        'after_response': after_response,  # The main content of the generated response
        'before_response': before_response,  # The prompt text that precedes the response
        'full_result': result  # The full raw output from the model
    }

    # Return only the generated response content
    return detailed['after_response']


# Ask the model a technical question
q = "To improve strength, toughness, and shock resistance in Mg-Al-Mn system cast magnesium alloys (e.g. AM100A), what should I do?"

from pprint import pprint
pprint(answer(q))
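
The `answer` function above indexes `split_content[1]` directly, so it raises an `IndexError` if the decoded text does not contain the `### Response:` marker. A minimal sketch of a more defensive extraction step follows. The helper name `extract_response` and its fallback behaviour are illustrative assumptions, not part of the original notebook, and the sample string is hand-written rather than real model output.

```python
RESPONSE_MARKER = "### Response:"
END_OF_TEXT = "<|end_of_text|>"


def extract_response(decoded: str) -> str:
    """Return the text after the first "### Response:" marker.

    Hypothetical helper: falls back to the whole decoded string
    when the marker is missing, instead of raising IndexError.
    """
    # partition returns (decoded, "", "") when the marker is not found
    _, marker, tail = decoded.partition(RESPONSE_MARKER)
    text = tail if marker else decoded
    # Drop the end-of-text token and surrounding whitespace
    return text.replace(END_OF_TEXT, "").strip()


# Hand-written string standing in for tokenizer.batch_decode(outputs)[0]
sample = (
    "### Instruction:\nWhat causes porosity?\n\n"
    "### Input:\n\n"
    "### Response:\nTrapped gas.<|end_of_text|>"
)
print(extract_response(sample))
```

Inside `answer`, the two lines that build `split_content` and `after_response` could be swapped for a call such as `extract_response(result[0])`, if this fallback behaviour is acceptable.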

Follow this notebook for help using the model.

📧 Contact

For any inquiries, feedback, or collaboration opportunities, feel free to reach out:

Email: abdodebo3@gmail.com
LinkedIn
GitHub
Phone: +20 1026821545

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
