A Finetuned Bloom 1b1 Model for Sequence Classification

The model was developed as a personal learning exercise: fine-tuning a pretrained language model for text classification and applying it to real-life data from the internet to perform sentiment analysis.

It has been generated using this raw template.

Model Details

The model achieves the following scores on the evaluation set during fine-tuning:

Here is the train/eval/test split:

DatasetDict({
    train: Dataset({
        features: ['review', 'sentiment'],
        num_rows: 36000
    })
    test: Dataset({
        features: ['review', 'sentiment'],
        num_rows: 5000
    })
    eval: Dataset({
        features: ['review', 'sentiment'],
        num_rows: 9000
    })
})
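The row counts above are consistent with holding out 10% of 50,000 reviews for testing and then 20% of the remainder for evaluation. That is an inference from the numbers, not something stated in this card. The sketch below reproduces the arithmetic in plain Python; the function name, the fractions and the seed are all hypothetical.

```python
import random

def three_way_split(n_rows, test_frac=0.10, eval_frac=0.20, seed=42):
    """Shuffle row indices and cut them into train/eval/test index lists.

    test_frac is taken from the full set, eval_frac from what remains.
    Both fractions and the seed are assumptions chosen to match the row counts.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = round(n_rows * test_frac)
    n_eval = round((n_rows - n_test) * eval_frac)
    return {
        "train": indices[n_test + n_eval:],
        "eval": indices[n_test:n_test + n_eval],
        "test": indices[:n_test],
    }

splits = three_way_split(50_000)
print({name: len(rows) for name, rows in splits.items()})
```

With the `datasets` library, two successive calls to `Dataset.train_test_split` would give the same kind of split.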

Model Description

Developed by: Snoop088
Model type: Text Classification / Sequence Classification
Language(s) (NLP): English
License: Apache 2.0
Finetuned from model: bigscience/bloom-1b1

Model Sources [optional]

Repository: https://huggingface.co/snoop088/imdb_tuned-bloom1b1-sentiment-classifier/tree/main
Paper [optional]: [More Information Needed]
Demo [optional]: [More Information Needed]

Uses

The model is intended to be used for Text Classification.

Direct Use

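Example script for using the model. Please note that this is a PEFT adapter on top of the Bloom 1b1 model:

import pandas as pd
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

DEVICE = "cuda:0" if torch.cuda.is_available() else "cpu"
model_name = 'snoop088/imdb_tuned-bloom1b1-sentiment-classifier'
# The repository holds a PEFT adapter, so the `peft` package must be installed.
loaded_model = AutoModelForSequenceClassification.from_pretrained(model_name, 
                                                                  trust_remote_code=True, 
                                                                  num_labels=2,
                                                                  device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

# CSV file with a "review" column holding the texts to classify
my_set = pd.read_csv("./data/df_manual.csv")

inputs = tokenizer(list(my_set["review"]), truncation=True, padding="max_length", max_length=256, return_tensors="pt").to(DEVICE)
with torch.no_grad():  # inference only, no gradients needed
    outputs = loaded_model(**inputs)
# Predicted class per review: 1 = positive, 0 = negative
outcome = outputs.logits.argmax(dim=-1).cpu().numpy()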
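The script stops at class indices. A minimal sketch of turning raw logits into readable labels with a confidence score follows. It uses plain Python lists so it runs without a GPU; the names `ID2LABEL`, `softmax` and `decode` are hypothetical, the logits are made up for illustration, and the 1 = positive, 0 = negative mapping follows the label encoding used in the preprocessing step.

```python
import math

ID2LABEL = {0: "negative", 1: "positive"}  # same encoding as the training labels

def softmax(row):
    """Numerically stable softmax over one row of logits."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def decode(logits):
    """Return a (label, confidence) pair for each row of two-class logits."""
    results = []
    for row in logits:
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((ID2LABEL[best], probs[best]))
    return results

# Hypothetical logits for two reviews; in practice pass outputs.logits.tolist()
example_logits = [[-1.2, 2.3], [0.9, -0.4]]
for label, confidence in decode(example_logits):
    print(f"{label}: {confidence:.3f}")
```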


Downstream Use [optional]

The model is intended for sentiment analysis on datasets similar to the IMDB one. In my opinion, it should also work well on product reviews.


Out-of-Scope Use

[More Information Needed]

Bias, Risks, and Limitations

[More Information Needed]

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

Training Details

Training Data

Training is done on the IMDB dataset available on the Hub:

imdb


Training Procedure

from transformers import TrainingArguments

training_arguments = TrainingArguments(
    output_dir="your_tuned_model_name",
    save_strategy="epoch",
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,
    optim="adamw_torch",
    evaluation_strategy="steps",
    logging_steps=5,
    learning_rate=1e-5,
    max_grad_norm=0.3,
    eval_steps=0.2,  # a float below 1 is read as a fraction of the total training steps
    num_train_epochs=2,
    warmup_ratio=0.1,  # has no effect with the "constant" scheduler set below
    # group_by_length=True,
    fp16=False,
    weight_decay=0.001,
    lr_scheduler_type="constant",
)
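To make the effect of these arguments concrete, the sketch below works out the effective batch size, the number of optimizer steps and the evaluation interval for the 36,000-row training split. It is a back-of-the-envelope calculation, not the Trainer's own code; the function name is hypothetical and a single GPU is assumed, as listed under Hardware.

```python
import math

def training_schedule(num_train_rows, per_device_batch, grad_accum, epochs,
                      eval_ratio, num_devices=1):
    """Derive step counts from the training arguments (single device assumed)."""
    effective_batch = per_device_batch * grad_accum * num_devices
    batches_per_epoch = math.ceil(num_train_rows / (per_device_batch * num_devices))
    # One optimizer step happens every `grad_accum` batches
    steps_per_epoch = max(batches_per_epoch // grad_accum, 1)
    total_steps = steps_per_epoch * epochs
    # A fractional eval_steps is interpreted as a share of the total steps
    eval_every = math.ceil(total_steps * eval_ratio)
    return {
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "eval_every": eval_every,
    }

schedule = training_schedule(36_000, per_device_batch=4, grad_accum=4,
                             epochs=2, eval_ratio=0.2)
print(schedule)
```

Under these assumptions the model sees 16 reviews per optimizer step and is evaluated five times over the whole run.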

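from peft import LoraConfig, get_peft_model

# `model` is the base bigscience/bloom-1b1 model loaded for sequence classification
peft_model = get_peft_model(model, LoraConfig(
    task_type="SEQ_CLS",
    r=16,
    lora_alpha=16,
    target_modules=[
        'query_key_value',
        'dense'
    ],
    bias="none",
    lora_dropout=0.05,  # Conventional
))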

LoRA results in: trainable params: 3,542,016 || all params: 1,068,859,392 || trainable%: 0.3313827830405592
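The trainable parameter count can be checked by hand. The sketch below assumes a hidden size of 1536 and 24 transformer blocks for bloom-1b1, assumes that `dense` matches only the attention output projection (not the MLP layers `dense_h_to_4h` and `dense_4h_to_h`), and assumes that the bias-free classification head stays trainable for the SEQ_CLS task type. These are assumptions rather than statements from this card, but together they reproduce the reported figure.

```python
def lora_params(in_features, out_features, r):
    """A LoRA pair adds an (r x in) matrix A and an (out x r) matrix B."""
    return r * (in_features + out_features)

HIDDEN = 1536      # assumed hidden size of bloom-1b1
LAYERS = 24        # assumed number of transformer blocks
R = 16             # LoRA rank from the config above
NUM_LABELS = 2

per_layer = (
    lora_params(HIDDEN, 3 * HIDDEN, R)  # query_key_value: hidden -> 3 * hidden
    + lora_params(HIDDEN, HIDDEN, R)    # attention output projection `dense`
)
adapter_total = per_layer * LAYERS
head = HIDDEN * NUM_LABELS              # classification head, assumed to have no bias
trainable = adapter_total + head

ALL_PARAMS = 1_068_859_392              # total reported above
print(f"trainable params: {trainable:,}")
print(f"trainable%: {100 * trainable / ALL_PARAMS:.4f}")
```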

Preprocessing [optional]

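Simple tokenization, with dynamic padding handled by DataCollatorWithPadding:

from transformers import DataCollatorWithPadding

def process_data(example):
    # Truncate only; padding is left to the data collator, which pads per batch
    item = tokenizer(example["review"], truncation=True, max_length=320)
    item["labels"] = [1 if sent == 'positive' else 0 for sent in example["sentiment"]]
    return item

# `dataset` is an assumed name for the DatasetDict with the train/eval/test split
tokenised_data = dataset.map(process_data, batched=True)
tokenised_data = tokenised_data.remove_columns(["review", "sentiment"])
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)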
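The idea behind dynamic padding is that each batch is padded only to its own longest sequence rather than to a fixed maximum. The sketch below illustrates the difference on made-up token ids. It is not the library's implementation; `pad_batch` and the ids are hypothetical, and right-padding is chosen for simplicity even though the real tokenizer may pad on the other side.

```python
def pad_batch(sequences, pad_id, max_length=None):
    """Pad token-id lists to a common length and build attention masks.

    With max_length=None the target is the longest sequence in the batch
    (dynamic padding); otherwise every sequence is padded to max_length.
    Sequences are assumed to be truncated already.
    """
    target = max_length if max_length is not None else max(len(s) for s in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = target - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}

batch = [[11, 12, 13, 14], [21, 22], [31, 32, 33]]  # hypothetical token ids
dynamic = pad_batch(batch, pad_id=0)
fixed = pad_batch(batch, pad_id=0, max_length=8)

print("dynamic width:", len(dynamic["input_ids"][0]))
print("fixed width:", len(fixed["input_ids"][0]))
```

Shorter batches mean fewer padded positions to process, which is why truncating at tokenization time and padding in the collator is a common combination.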

Training Hyperparameters

Training regime: [More Information Needed]

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Evaluation function:

import evaluate
import numpy as np

# All metrics are already predefined in the HF `evaluate` package; load them once
precision_metric = evaluate.load("precision")
recall_metric = evaluate.load("recall")
f1_metric = evaluate.load("f1")
accuracy_metric = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred  # eval_pred is the tuple of predictions and labels returned by the model
    predictions = np.argmax(logits, axis=-1)
    precision = precision_metric.compute(predictions=predictions, references=labels)["precision"]
    recall = recall_metric.compute(predictions=predictions, references=labels)["recall"]
    f1 = f1_metric.compute(predictions=predictions, references=labels)["f1"]
    accuracy = accuracy_metric.compute(predictions=predictions, references=labels)["accuracy"]
    # The trainer expects a dictionary where the keys are the metric names and the values are the scores.
    return {"precision": precision, "recall": recall, "f1-score": f1, "accuracy": accuracy}
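For reference, the four scores can be computed by hand from the prediction and label lists. The sketch below shows the binary case with class 1 (positive) as the target class; the function name and the toy data are hypothetical and serve only to illustrate the definitions.

```python
def binary_metrics(predictions, references, positive=1):
    """Precision, recall, F1 and accuracy for binary labels."""
    pairs = list(zip(predictions, references))
    tp = sum(p == positive and r == positive for p, r in pairs)
    fp = sum(p == positive and r != positive for p, r in pairs)
    fn = sum(p != positive and r == positive for p, r in pairs)
    correct = sum(p == r for p, r in pairs)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = correct / len(pairs)
    return {"precision": precision, "recall": recall, "f1-score": f1, "accuracy": accuracy}

# Toy example: six reviews, 1 = positive, 0 = negative
preds = [1, 0, 1, 1, 0, 0]
refs = [1, 0, 0, 1, 1, 0]
scores = binary_metrics(preds, refs)
print(scores)
```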

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Hardware Type: [More Information Needed]
Hours used: [More Information Needed]
Cloud Provider: [More Information Needed]
Compute Region: [More Information Needed]
Carbon Emitted: [More Information Needed]

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

CPU: 13th Gen Intel(R) Core(TM) i9-13900K (model 6.183.1)
GPU: Nvidia RTX 4090, 24 GB
Memory: 64 GB

Software

python 3.11.6
transformers 4.36.2
torch 2.1.2
peft 0.7.1
numpy 1.26.2
datasets 2.16.0

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]
