DriLLM Summarizer

Background

This model is fine-tuned from meta-llama/Meta-Llama-3-8B. It was trained on the Volve DDR dataset with Axolotl, using the Alpaca prompt template.

The motivation was to fine-tune an LLM that understands the nuances of drilling operations and can produce 24-hour summaries from the hourly activities recorded in Daily Drilling Reports (DDRs).

How to use

Sample Colab

Here's a Google Colab notebook to help you get started with the model.

Recommended template for DriLLM-Summarizer:

TEMPLATE = """<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}


### Input:
{input}


### Response:

"""
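If your hourly activities live in structured form rather than as free text, the template can be filled programmatically. The following is a minimal sketch, not part of the model's tooling: `format_activities` and `build_prompt` are hypothetical helper names, and the `(start, end, description)` tuple layout is an assumption chosen to mirror the `HH:MM - HH:MM: description` lines used in the example further down.

```python
# Hypothetical helpers for illustration; they are not shipped with the model.
TEMPLATE = """<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}


### Input:
{input}


### Response:

"""


def format_activities(activities):
    """Render (start, end, description) tuples as 'HH:MM - HH:MM: description' lines."""
    lines = [f"{start} - {end}: {desc.strip()}" for start, end, desc in activities]
    return "\n".join(lines) + "\n"


def build_prompt(instruction, activities):
    """Fill the Alpaca-style template with an instruction and the formatted activities."""
    return TEMPLATE.format(instruction=instruction, input=format_activities(activities))


# Assumed record layout: (start time, end time, free-text description).
activities = [
    ("00:00", "11:00", "Packed equipment and prepared for backload. Cleaned drillfloor and cantilever."),
    ("11:00", "17:00", "Cleaned and tidied offices and workspace."),
]

prompt = build_prompt(
    "Given the following activities for well XX, please prepare the 24-hour summary "
    "for the Daily Drilling Report (DDR).",
    activities,
)
print(prompt)
```

Only the template string is parsed by `str.format`, so curly braces inside an activity description are passed through unchanged.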

Inference using the Transformers pipeline

The code below was tested on Google Colab (with the free T4 GPU).

import transformers
import torch

model_id = "bengsoon/DriLLM-Summarizer"

pipeline = transformers.pipeline(
    "text-generation", model=model_id, model_kwargs={"torch_dtype": torch.bfloat16}, device_map="auto"
)

TEMPLATE = """<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}


### Input:
{input}


### Response:

"""

INSTRUCTION = """You are a Rig Supervisor working at an oil and gas offshore drilling operation. \
Your company is currently on a drilling campaign and you are the on-site Drilling Engineer (DE). \
As a DE, one of your jobs is to oversee the operations at the drilling rigs. As such, you know the ins and outs of the operation, down to the hourly activities. \
Every day, activities are recorded either by the Driller, Mud Logger, MWD / LWD engineer or the Drilling Operations Coordinator throughout the day. \
As a DE representative for your company, you are required to prepare the 24-hour summary for the Daily Drilling Report (DDR) based on the hourly activities reported. \
You must always maintain the language of report along with the terminologies and mnemonics of the Drilling Engineer. \
Given the following activities for well XX, please prepare the 24-hour summary for the Daily Drilling Report (DDR). \
Only return the 24-hour summary, and nothing else.
"""

hourly_events = """00:00 - 11:00: Packed equipment and prepared for backload. Cleaned drillfloor and cantilever.
11:00 - 17:00: Performed are inspection with barge engineer. Cleaned and tidied offices and workspace. Demobilized all personell. End of operation
"""

# Named `prompt` rather than `input` to avoid shadowing Python's built-in input().
prompt = TEMPLATE.format(instruction=INSTRUCTION, input=hourly_events)

output = pipeline(prompt)

print("Response: ", output[0]["generated_text"].split("### Response:")[1].strip())
# > Response:  Packed equipment and prepared for backload. Cleaned drillfloor and cantilever. Performed are inspection with barge engineer. Cleaned and tidyied offices and workspaces.
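By default the text-generation pipeline returns the prompt followed by the completion, which is why the example splits on `### Response:`. A slightly more defensive version of that step might look like the sketch below. `extract_summary` is a hypothetical helper name, and the mocked `output` list only imitates the shape of the pipeline's return value so the snippet runs without loading the model.

```python
# Hypothetical helper for illustration; it only post-processes the pipeline's output.
def extract_summary(pipeline_output, marker="### Response:"):
    """Return the text after the first response marker, or the whole text if it is absent."""
    generated = pipeline_output[0]["generated_text"]
    _, found, tail = generated.partition(marker)
    # Fall back to the full text instead of raising IndexError when the marker is missing,
    # e.g. if the pipeline was called with return_full_text=False.
    return tail.strip() if found else generated.strip()


# Mocked output with the same structure the pipeline returns: a list of dicts.
output = [
    {
        "generated_text": "### Instruction:\n...\n\n### Input:\n...\n\n### Response:\n\n"
        "Packed equipment and prepared for backload."
    }
]
print("Response: ", extract_summary(output))
```

With the real pipeline, the same function can be applied directly to the value returned by `pipeline(...)`.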

Quantized model

If you are GPU-constrained, you can load the model with 8-bit quantization (this requires the bitsandbytes package):

import torch
import transformers
from transformers import BitsAndBytesConfig

model_id = "bengsoon/DriLLM-Summarizer"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={
        "torch_dtype": torch.bfloat16,
        # Load the weights in 8-bit via bitsandbytes; remove this entry to load in bfloat16 only.
        "quantization_config": BitsAndBytesConfig(load_in_8bit=True),
    },
    device_map="auto",
)
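To get a feel for why 8-bit loading helps, a back-of-the-envelope estimate of the memory taken by the weights alone can be sketched as follows. The parameter count is an assumption (a round 8 billion for an 8B-class model), `weight_memory_gib` is a hypothetical helper, and the figure ignores activations, the KV cache and quantization overhead, so actual usage will be higher.

```python
def weight_memory_gib(n_params, bits_per_param):
    """Approximate memory for the model weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 1024**3


# Assumption: a round figure for an 8B-parameter model, not an exact count.
N_PARAMS = 8_000_000_000

for label, bits in [("bfloat16", 16), ("8-bit", 8)]:
    print(f"{label}: ~{weight_memory_gib(N_PARAMS, bits):.1f} GiB for weights")
```

Halving the bits per parameter halves the weight footprint, which is the main saving 8-bit loading offers on a memory-limited GPU.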
