---
license: apache-2.0
datasets:
- bengsoon/volve_alpaca
language:
- en
base_model:
- Meta/Meta-Llama-3-8B
pipeline_tag: summarization
tags:
- oil-and-gas
- energy
- drilling
---

# DriLLM Summarizer

## Background

This model is a fine-tune of [Meta/Meta-Llama-3-8B](https://huggingface.co/Meta/Meta-Llama-3-8B). It was trained on the [Volve DDR dataset](https://huggingface.co/datasets/bengsoon/volve_alpaca) in the Alpaca prompt format, using [Axolotl](https://github.com/axolotl-ai-cloud/axolotl).

The goal was an LLM that understands the nuances of drilling operations and produces 24-hour summaries from the hourly activities recorded in Daily Drilling Reports (DDRs).

## How to use

### Sample Colab

Here's a [Google Colab notebook](https://colab.research.google.com/drive/10Txp14M-yeJG3hRAB8U2ydPrWFE1bypW?usp=sharing) to get started with the model.

### Recommended template for DriLLM-Summarizer:

```python
TEMPLATE = """<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}


### Input:
{input}


### Response:

"""
```
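
To illustrate how the two placeholders are filled, here is a minimal sketch. `build_prompt` is a hypothetical helper introduced only for this example (it is not part of the model repository), and the short instruction and activity strings are placeholders rather than the prompt used in training.

```python
# Alpaca-style template copied from this model card.
TEMPLATE = """<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}


### Input:
{input}


### Response:

"""


def build_prompt(instruction: str, activities: str) -> str:
    """Fill the template. Hypothetical helper, not part of the model repo."""
    if not activities.strip():
        raise ValueError("activities must not be empty")
    # str.format only interprets braces in TEMPLATE itself, so braces
    # inside the instruction or activities text are passed through as-is.
    return TEMPLATE.format(instruction=instruction, input=activities)


prompt = build_prompt(
    "Prepare the 24-hour summary for the Daily Drilling Report (DDR).",
    "00:00 - 11:00: Packed equipment and prepared for backload.",
)
print(prompt)
```

The rendered prompt ends with the `### Response:` header followed by a blank line, which is where the model is expected to continue.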

### Inferencing using Transformers Pipeline

The code below was tested on Google Colab with the free T4 GPU.

```python
import transformers
import torch

model_id = "bengsoon/DriLLM-Summarizer"

# Load the model in bfloat16 and let Accelerate place it on the available device(s).
pipeline = transformers.pipeline(
    "text-generation", model=model_id, model_kwargs={"torch_dtype": torch.bfloat16}, device_map="auto"
)

# Alpaca-style prompt template (same as the recommended template above).
TEMPLATE = """<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}


### Input:
{input}


### Response:

"""

INSTRUCTION = """You are a Rig Supervisor working at an oil and gas offshore drilling operation. \
Your company is currently on a drilling campaign and you are the on-site Drilling Engineer (DE). \
As a DE, one of your jobs is to oversee the operations at the drilling rigs. As such, you know the ins and outs of the operation, down to the hourly activities. \
Every day, activities are recorded either by the Driller, Mud Logger, MWD / LWD engineer or the Drilling Operations Coordinator throughout the day. \
As a DE representative for your company, you are required to prepare the 24-hour summary for the Daily Drilling Report (DDR) based on the hourly activities reported. \
You must always maintain the language of report along with the terminologies and mnemonics of the Drilling Engineer. \
Given the following activities for well XX, please prepare the 24-hour summary for the Daily Drilling Report (DDR). \
Only return the 24-hour summary, and nothing else.
"""

# Hourly activities, one "HH:MM - HH:MM: description" entry per line.
hourly_events = """00:00 - 11:00: Packed equipment and prepared for backload. Cleaned drillfloor and cantilever.
11:00 - 17:00: Performed are inspection with barge engineer. Cleaned and tidied offices and workspace. Demobilized all personell. End of operation
"""

# Named `prompt` rather than `input` to avoid shadowing the Python builtin.
prompt = TEMPLATE.format(instruction=INSTRUCTION, input=hourly_events)

output = pipeline(prompt)

# The generated text echoes the prompt, so keep only what follows the response header.
print("Response: ", output[0]["generated_text"].split("### Response:")[1].strip())
# > Response: Packed equipment and prepared for backload. Cleaned drillfloor and cantilever. Performed are inspection with barge engineer. Cleaned and tidyied offices and workspaces.
```
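
If the hourly activities live in structured records rather than a ready-made string, the input text and the post-processing step can be sketched as below. Both `format_activities` and `extract_summary` are hypothetical helpers written for this example, and the `(start, end, description)` tuple layout is an assumption about how the records are stored.

```python
from typing import Iterable, Tuple

RESPONSE_MARKER = "### Response:"


def format_activities(rows: Iterable[Tuple[str, str, str]]) -> str:
    """Render (start, end, description) rows as 'HH:MM - HH:MM: description' lines."""
    return "".join(f"{start} - {end}: {desc.strip()}\n" for start, end, desc in rows)


def extract_summary(generated_text: str) -> str:
    """Return the text after the first response marker, or raise if it is missing."""
    parts = generated_text.split(RESPONSE_MARKER)
    if len(parts) < 2:
        raise ValueError("response marker not found in generated text")
    return parts[1].strip()


rows = [
    ("00:00", "11:00", "Packed equipment and prepared for backload."),
    ("11:00", "17:00", "Cleaned and tidied offices and workspace."),
]
hourly_events = format_activities(rows)
print(hourly_events)

# Stand-in for output[0]["generated_text"]; no model call is made here.
fake_generated = "...prompt text...\n### Response:\n\nPacked equipment and prepared for backload.\n"
print(extract_summary(fake_generated))
```

Raising an error when the marker is absent makes a truncated or malformed generation visible instead of failing with an `IndexError` on the split.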

### Quantized model

If GPU memory is limited, you can load the model with 8-bit quantization:

```python
import torch
import transformers
from transformers import BitsAndBytesConfig  # requires the bitsandbytes package

model_id = "bengsoon/DriLLM-Summarizer"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={
        "torch_dtype": torch.bfloat16,
        # Load the weights in 8-bit; remove this entry to load in bfloat16 instead.
        "quantization_config": BitsAndBytesConfig(load_in_8bit=True),
    },
    device_map="auto",
)
```
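
As a rough guide to what 8-bit loading saves, the sketch below estimates the memory taken by the weights alone. The parameter count of about 8 billion is an assumption for an 8B-class model, and the figures ignore activations, the KV cache, and any layers that the quantization backend keeps in higher precision, so actual usage will be somewhat higher.

```python
# Assumption: roughly 8.0e9 parameters for an 8B-class model; the exact count differs slightly.
APPROX_PARAMS = 8.0e9

# Bytes needed to store one parameter in each format.
BYTES_PER_PARAM = {"bfloat16": 2, "int8": 1}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gib(APPROX_PARAMS, dtype):.1f} GiB")
```

Under these assumptions, 8-bit weights need about half the memory of bfloat16 weights.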